Re: Automatic parallelization - was Re: LISP Object Oriented?



Tim Bradshaw wrote:
> On Jan 30, 5:59 pm, Paul Wallich <p...@xxxxxxxxx> wrote:
>
>> The big question is whether inefficient use of cores is a bad thing.
>
> It's a bad thing because it's making poor use of expensive silicon.
> The only valid (as opposed to marketing) reason to put more cores on a
> chip is that it is (or is hoped to be) the best way of getting more
> performance out of the resources (silicon in other words) available.
> If the cores are starving for memory then a better use of the
> resources would be to deal with that problem.

If you can. Once you get off-chip, the cost of increasing memory bandwidth can make the cost of putting another few cores on the die look like small change. So the question is whether a bit more cache is going to do you more good than more cores, and that's (imo) a tradeoff that should be looked at. If more cores or more functional units will buy you more speed, even though not as much more as you'd like, you may still be happy.

(I may have misspoken when I said "inefficient use of cores" -- by definition, inefficiency is worse than efficiency, but in many cases partial utilization of resources is going to be the best you can do. See, for example, the terrible inefficiency of RAM chips, where only a few transistors out of millions are doing anything useful on a given clock cycle.)

> (This is similar to the reason why processors typically don't
> speculatively execute both sides of a branch: the maximum possible
> utilization of that is 50%, so it's generally better to predict and
> then speculatively execute the branch you predict to be taken, which
> can do much better than 50% on typical code.)

Absolutely. Then take the example someone gave of striping certain loops across cores rather than unrolling them -- the prolog and postlog sequences will almost certainly involve less-efficient utilization of cores than a single-core loop, but you'll still be getting better speed.
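
A toy sketch of that tradeoff, in Python, for anyone who wants numbers to argue with. Everything in it is an assumption made for illustration: the function names `stripe` and `striped_cost`, the round-robin assignment of iterations to cores, and a cost model where each iteration takes one time unit and the prolog and postlog together add a fixed `overhead` to wall time. It models the arithmetic only and does not actually run anything in parallel.

```python
def stripe(n_iterations, n_cores):
    """Assign iteration i to core i % n_cores (round-robin striping)."""
    return [list(range(core, n_iterations, n_cores)) for core in range(n_cores)]


def striped_cost(n_iterations, n_cores, overhead):
    """Return (speedup, utilization) under a toy cost model.

    Assumed model: every iteration costs 1 time unit, and the striped
    version pays `overhead` extra units of wall time for its prolog and
    postlog (handing out work, joining results).
    """
    stripes = stripe(n_iterations, n_cores)
    longest = max(len(s) for s in stripes)   # the slowest core sets the pace
    wall_time = overhead + longest
    single_core_time = n_iterations          # the plain loop has no overhead
    speedup = single_core_time / wall_time
    # Useful work done, divided by the total core-time that was occupied.
    utilization = n_iterations / (n_cores * wall_time)
    return speedup, utilization


speedup, utilization = striped_cost(n_iterations=100, n_cores=4, overhead=5)
print(f"speedup {speedup:.2f}x, utilization {utilization:.0%}")
# prints "speedup 3.33x, utilization 83%"
```

With these made-up numbers the four cores are only about 83% busy, where the single-core loop is 100% busy by construction, yet the striped version still finishes more than three times sooner.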

In the short run (if things can be properly handled) it seems that the kindness of strangers should (ha!) keep multiple cores occupied fairly well (currently 76 processes visible on this machine, albeit not all simultaneously contending for CPU). But in the longer run, yeah, we're going to need different notations and ways of handling parallelism at many different levels of granularity (duh!)

And applications that are computation-intensive but not embarrassingly parallel will probably be socially ostracized.

paul


