Re: C Programmer Needed



CTips wrote:
> websnarf@xxxxxxxxx wrote:
> > CTips wrote:
> >>websnarf@xxxxxxxxx wrote:
> <snip: discussion of multithreading & C>
> >>The second approach is to extend the language to allow the programmer
> >>more fine-grain control. We start off by assuming pthreads (or some
> >>equivalent) and extending it. Some of the features which end up being
> >>needed:
> >>1. thread-private globals: note that this is not just an issue with the
> >>compiler; try expressing thread-private globals in ecoff.
> >
> > Microsoft implements this as a function call named something like
> > "GetThreadLocal...". If you retain well defined stack frames (which is
> > required for proper exception handling in C++), you can just crawl up
> > them to reach the top of the stack, and you can put your thread local
> > pointer there. But in general, a function like:
> >
> > void ** getThreadLocalBase (void);
> >
> > could be defined to give us what we desire. (So each thread would be
> > responsible for setting and maintaining its thread local base pointer.)
>
> The issue is how does one define
> int x; /* file scope */
> and have a unique copy per thread.
>
> There are various approaches that use explicit on-stack or in-heap
> allocation of objects, whose addresses are captured on-stack and then
> passed along. So, we can do:
>
> thread_main()
> {
> int * x_heap = malloc(sizeof(int));
> int x_;
> int * x_stack = &x_;
>
> foo(x_heap, x_stack, ...);
> }

Well, with this workaround you are explicitly passing the thread-local
context around, which is syntactically cumbersome. A function like
getThreadLocalBase instead finds its way back to an implicitly declared
void * at the top of your thread's stack. It is then up to the
programmer to hoist that call out of hot paths (since it would be
somewhat expensive), but it leads to simpler code.

> and so on, but that is not a thread private global.

Well, how do you expect the implementation of the thread private global
to work anyway? The thread has to indirectly address what is basically a
base-of-stack-indexed data area no matter what. In that case you
don't gain very much by doing this with new syntax, which also needs
lots of semantic explanation. With an actual function like
getThreadLocalBase, or by being forced to do things the way you suggest,
the programmer understands exactly what happens without having to learn
any new language concept.
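
For concreteness, here is a minimal sketch of how getThreadLocalBase could be layered on POSIX thread-specific data rather than on a stack crawl. Everything except the getThreadLocalBase signature is an assumption made for illustration: the key, the TL_SLOTS size, and the worker and thread_local_demo helpers are hypothetical names, and pthreads is assumed to be available.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

enum { TL_SLOTS = 4 };            /* illustrative number of per-thread slots */

static pthread_key_t tl_key;      /* one process-wide key, one value per thread */
static pthread_once_t tl_once = PTHREAD_ONCE_INIT;

static void tl_make_key(void) {
    /* free() is run on each thread's slot array when that thread exits */
    int rc = pthread_key_create(&tl_key, free);
    assert(rc == 0);
    (void)rc;
}

/* Returns this thread's private slot array, allocating it on first use. */
void **getThreadLocalBase(void) {
    pthread_once(&tl_once, tl_make_key);
    void **base = pthread_getspecific(tl_key);
    if (base == NULL) {
        base = calloc(TL_SLOTS, sizeof *base);
        if (base == NULL)
            abort();
        pthread_setspecific(tl_key, base);
    }
    return base;
}

/* Each worker stores its argument in its own slot 0 and reads it back. */
static void *worker(void *arg) {
    void **base = getThreadLocalBase();   /* hoisted: looked up once */
    base[0] = arg;
    return base[0];
}

/* Returns 1 if both threads read back only what they stored themselves. */
int thread_local_demo(void) {
    static int a = 1, b = 2;
    pthread_t t1, t2;
    void *r1 = NULL, *r2 = NULL;
    int rc = pthread_create(&t1, NULL, worker, &a);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, worker, &b);
    assert(rc == 0);
    (void)rc;
    pthread_join(t1, &r1);
    pthread_join(t2, &r2);
    return r1 == &a && r2 == &b;
}
```

The lookup still costs a function call plus an indirection, which is why the caller hoists it out of loops, as in worker above.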

> Not having thread private globals is not *that* big a problem, but the
> work-arounds can be quite painful. More importantly, however, is the
> fact that it makes optimizations somewhat harder.

Exactly what optimizations did you have in mind? I.e., do you have a
way of removing the address indirection?

> >>2. atomic operations: a mutex is a heavy weight way of trying to ensure
> >>atomicity in some cases. Consider atomic increment (or atomic push or
> >>atomic set bit or...). Its implementation on many machines is relatively
> >>cheap. However, expressing them in pthreads, e.g., is pretty clunky.
> >
> > Yeah, so there should be a library of functions like:
> >
> > int32_t atomicAddRead32 (int32_t * pCounter, int32_t delta);
> >
> > and so on. If your hardware doesn't have special support for such
> > operations, then you get to implement them from other simpler
> > primitives.
>
> In which case the standard library has to define lots of behaviors, plus
> all the trade-offs that are possible. [Do you want a mechanism
> that works well under low-contention? under high-contention? Or should
> the library provide both possible functions? Do you want an
> implementation optimized for running on a single processor machine, or
> on an SMP, or multiple-processors on a core?]

Ok, but you are talking about implementation versus specification here.
Keep in mind that many of these threading primitives are
re-expressible in terms of each other. So if you simply supply a large
enough API, you should be able to capture the sweet spot of any given
implementation, through some appropriate subset of them. It would
certainly be no worse than what the ANSI committee did to early
microprocessors when they insisted that floating point would be in the
language.
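
As an illustration of primitives being re-expressible in terms of each other, the sketch below builds the atomicAddRead32 operation out of nothing but a compare-and-swap loop. It assumes a compiler with C11 `<stdatomic.h>`, which was standardized after this thread; the `_Atomic` qualifier on the pointer is a change from the signature quoted above, and atomic_add_example is a hypothetical helper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Atomically adds delta to *pCounter and returns the new value,
   using only compare-and-swap as the underlying primitive. */
int32_t atomicAddRead32(_Atomic int32_t *pCounter, int32_t delta) {
    int32_t old = atomic_load(pCounter);
    int32_t desired;
    do {
        /* Add in unsigned arithmetic to avoid signed-overflow undefined
           behaviour; the conversion back is implementation-defined and
           wraps on mainstream two's-complement compilers. */
        desired = (int32_t)((uint32_t)old + (uint32_t)delta);
        /* On failure, old is reloaded with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(pCounter, &old, desired));
    return desired;
}

/* Small usage check: 5 + 3 gives 8, then 8 - 10 gives -2. */
void atomic_add_example(void) {
    _Atomic int32_t counter;
    atomic_init(&counter, 5);
    int32_t first = atomicAddRead32(&counter, 3);
    int32_t second = atomicAddRead32(&counter, -10);
    assert(first == 8);
    assert(second == -2);
    (void)first;
    (void)second;
}
```

On hardware with a native fetch-and-add, an implementation would presumably use that instead; the interface stays the same either way.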

> Also, this does not always express the intent - are you using this
> atomic increment to implement a mutex? Or truly just a counter?

A mutex usually includes implicit blocking, which cannot be achieved by
a counter alone. You should implement mutexes separately (you would
not want to spin-lock in a single processor environment, for example,
and you still want to give away your time slice in SMP configurations).
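
A rough sketch of the difference, assuming C11 atomics and POSIX sched_yield: the lock below is built from an atomic flag, but instead of spinning it gives up its time slice whenever the flag is already held. The yield_lock type and the helper names are hypothetical, and this is still not true blocking; a real mutex would have the kernel park the waiting thread.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

typedef struct { atomic_flag held; } yield_lock;   /* hypothetical type */

static void yield_lock_acquire(yield_lock *l) {
    /* test_and_set returns the previous state: true means someone holds it */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        sched_yield();                 /* hand the CPU to another thread */
}

static void yield_lock_release(yield_lock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static yield_lock counter_lock = { ATOMIC_FLAG_INIT };
static long shared_total;              /* plain variable, guarded by the lock */

static void *add_many(void *arg) {
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        yield_lock_acquire(&counter_lock);
        shared_total++;
        yield_lock_release(&counter_lock);
    }
    return NULL;
}

/* Two threads each add per_thread to the total; returns the final total. */
long locked_count_demo(long per_thread) {
    pthread_t a, b;
    shared_total = 0;
    int rc = pthread_create(&a, NULL, add_many, &per_thread);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, add_many, &per_thread);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_total;
}
```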

> Instead of telling the processor what to do, it is sometimes better to
> tell the compiler what the intention is, and let it pick the
> implementation. While a library based approach does not necessarily
> inhibit this, it would be a cleaner solution to extend the language.

How would you propose to extend the language?

> >>3. sharing+rendezvous/barrier: consider an array that is used for double
> >>buffering messages between two threads. Every time we start processing a
> >>message, we need to use the new values in the array. One way of telling
> >>the compiler this is to mark the array as volatile. However, this is too
> >>heavy weight. It inhibits a lot of optimizations in the compiler.
> >>Instead, we have to come up with a mechanism that tells the compiler "at
> >>this point, assume all old values from this array are dead". Note that
> >>this is a statement about a specific variable at a specific point in the
> >>code.
> >
> > Usually you need a token or flag of some sort to allow this to work.
> > So in fact, all you need is to be able to declare some parts of a
> > struct volatile. Can't you do this today?
>
> No. Consider the following:
>
> struct {
> volatile int to;
> int msg;
> } comm;
>
> ...
> comm.msg = 10;
> comm.to = YOU;
>
> while( comm.to != ME ) {
> }
>
> printf("msg from YOU: %d\n", comm.msg);
>
> The compiler is free to do constant propagation on the non-volatile
> portions, and have the program always print
> msg from YOU: 10

I see. So what you want is something like

barrier comm.msg;

before the printf line, where barrier would block variable motion,
hoisting, or constant propagation with respect to that variable across
the barrier.
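
There is no per-variable barrier statement in C, but as a point of comparison, C11 (standardized after this thread) lets the flag itself carry the ordering. The sketch below reworks the comm example under that assumption: the flag becomes an atomic_int written with release and read with acquire, and the payload stays a plain int. The thread function, exchange_message, and the ME/YOU values are hypothetical, and pthreads is assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

enum { ME = 0, YOU = 1 };

static struct {
    atomic_int to;   /* whose turn it is; takes the place of the volatile flag */
    int msg;         /* plain, non-atomic payload */
} comm;

/* The other side: waits for its turn, rewrites the message, hands it back. */
static void *you_thread(void *unused) {
    (void)unused;
    while (atomic_load_explicit(&comm.to, memory_order_acquire) != YOU)
        sched_yield();
    comm.msg = comm.msg + 32;          /* reads 10, writes 42 */
    atomic_store_explicit(&comm.to, ME, memory_order_release);
    return NULL;
}

/* Sends 10 to the other thread and returns the reply it wrote back. */
int exchange_message(void) {
    pthread_t t;
    atomic_store_explicit(&comm.to, ME, memory_order_relaxed);
    comm.msg = 10;
    int rc = pthread_create(&t, NULL, you_thread, NULL);
    assert(rc == 0);
    (void)rc;
    atomic_store_explicit(&comm.to, YOU, memory_order_release);
    while (atomic_load_explicit(&comm.to, memory_order_acquire) != ME)
        sched_yield();
    /* The acquire load that observed ME orders this read after the other
       thread's write, so the compiler may not reuse the stale value 10. */
    int reply = comm.msg;
    pthread_join(t, NULL);
    return reply;
}
```

Unlike the proposed barrier, the acquire load constrains all later memory accesses, not just comm.msg, so it is coarser than what is being asked for here.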

> >>To tie these together, we need to define the semantics of C using a
> >>*parallel* virtual machine. I don't think that this has been done.
> >
> > Well, perhaps not a virtual machine, but in the HPC/Supercomputing
> > arena, I think "MPI" (http://www-unix.mcs.anl.gov/mpi/) was designed
> > precisely for this sort of thing.
>
> It assumes processes. It's pretty heavy weight. It assumes that the
> compiler will not co-optimize all the processes, and that using
> functions will make things opaque. It is not possible (IMHO) to
> implement in C an MPI library/program that will not break [this assumes
> a whole program (including the MPI libraries) compilation using a smart
> enough compiler].

I see.

> We've run into this problem in several different flavors, which arise
> because we are using a very aggressive whole program compiler and doing
> whole program compilation, including the libraries. Note that the source
> code of the library does not have to be available; it is possible to have "fat"
> libraries that contain an IL representation of the library functions
> suitable for further optimization (including inlining).

So can you point to a proposal or something that actually addresses
these problems? Is Java a sufficient answer, for example?

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/
