Re: Interview with Alan Kay

From: Duane Rettig (duane_at_franz.com)
Date: 02/28/05


Date: 28 Feb 2005 01:02:55 -0800

Dave Roberts <dave-remove@remove-findinglisp.com> writes:

> Duane Rettig <duane@franz.com> writes:
>
> > Dave Roberts <dave-remove@remove-findinglisp.com> writes:
> >
> > > Duane Rettig <duane@franz.com> writes:
> >
> > > > For GC support, the real problem to solve is fast forwarding of
> > > > pointers that have moved.
> > >
> > > If you could design GC from scratch, would you do a copying collector?
> > > They are popular with current systems, particularly generational GC
> > > systems, but there is clearly overhead there.
> >
> > Well, Generational GCs are not garbage collectors, they are actually
> > garbage _leavers_. They collect live data, and leave the garbage.
> > They work efficiently because of the actual tendency of most garbage
> > to become unreferenced early in its lifetime. Therefore, a higher
> > percentage of garbage is present in a first generation than in any
> > successive generation, and thus collecting the live data is going
> > to be more efficient than collecting the garbage and leaving the
> > live data uncopied. [Another nicety of collecting live data is that
> > the "marking" of the data a live no longer needs a bit to be set
> > for that data; if the data has moved, then that also becomes the
> > indicator that the data is live.] Once the number of generations
> > is high enough, it no longer is as useful to collect data, and
> > perhaps it becomes less efficient to continue to move live data
> > around at that time (though it is still easier to identify live
> > data than dead, so live-data-collection might still be worthwhile
> > even in older generations, though there are these tradeoffs). So
> > yes, when I talk about forwarding, I am assuming a copying portion
> > of a garbage collector, probably in a very young generation.
>
> I think you missed the root of my question.

No, I didn't, but in order to really answer it, I would have to
provide a lot of background. OK, you asked for it...

> I understand the theory of
> generational GCs. I understand why copying collectors are typically
> used for young generations. But those collectors are built for
> existing machines with a certain set of assumptions behind them. Some
> of those assumptions are about the nature of memory allocation (the
> young typically die quickly) and some are about the nature of the
> standard machines on which these operate (memory access speeds,
> paging, etc.). Obviously, the first set of assumptions about the
> behavior of garbage is fixed given the language in which we work.

Agreed.

> The question is really about changing the assumptions of the second
> set.

Yes, and my caution is that we want not to do that. In a previous
career I was a hardware engineer, and I worked in test engineering
labs. We built special-purpose equipment for specialized testing
strategies, and yet there was always the requirement that we keep
costs down and reuse high as much as possible - the problem with
not doing so was not that it was too expensive, but that we would
miss dates for production startups. Thus, even in such highly
specialized situations, we tried as much as possible to stay with
a mainstream of design thought, geared toward the high reuse of
older designs _and_ the brainstorming of potential future uses.
Needless to say that this thinking led to the incorporation of
a lot of software into this design process. And as little as
we could change hardware, the decisions were usually weighted
toward those choices.

When I started working on Lisp, I joined a Company that believed in
getting the most out of General Purpose hardware, and I joined
it for a reason, so it should be no surprise that I advocate for
a more GP solution, or taking advantage of hardware that can
also be used generally; thus increasing its chances of that
hardware surviving a generation or two of redesigns.

> If we could make a machine do what we *want* rather than what
> some CPU designer thought was the right thing, could we do better? If
> so, would we still use the same algorithms, or would we shift to a new
> optimum point on the curve?

Very likely so, but if it did not also have a local optimum for
non-gc languages, then it would be likely to die the same death
that most special-purose hardware eventually dies.

Now, I believe that my view on what constitutes GP hardware might be
surprising, so I will state it simply - it is not the architecture that
makes a computer General Purpose, but its survivability. Consider,
for example, your own computer; it is actually a special-purpose box
that has a certain set of hardware that runs on it at a certain speed
(likely it is as fast as it can reasonably be clocked), and even if you
just bought it recently, chances are it will be obsolete in 5 years.
Why? Because you will be buying other boxes (perhaps even with a
different architecture) by then, because your nice, crisp box will
seem so painfully slow by then. And computer manufacturers keep making
new machines, so the older ones will continue to grow obsolete.
But each generation of architecture has many points in common with its
predecessors, so there is a continuum along which buyers of these
computers are willing to travel. That is my GP hardware philosophy.

So how does a new feature make it into an architecture _and_ stay? It
must be in demand by a broad range of users, at least enough to sustain
a critical mass of buyers. Otherwise, costs of adding the extra usage to
the real-estate of the chip, and the extra design and debug time, make
it less likely that a feature would get in.

Now, one thing that we GC advocates now have going for us is the
newfound acceptability in the marketplace of the concept of
garbage-collection. It is no longer a dirty word, although there
are still many different gc concepts and theories about how the
colorations can be done either in software or sped up in hardware.
So whether a hardware-based approach will fly (i.e. be incorporated
into silicon and last) depends on how general the concept can be
made, and how much gain it can give to all languages for an appropriate
amount of cost.

So, finally, we come to the answer to your question. But, (and sorry
about this :-) instead of just giving you an answer, which would only
be an opinion of mine, let me turn your question on its ear and let
me answer you by asking you a question: Given that you want to remove
the copying of objects by a generational gc, what steps might you take
in hardware to maintain the generational nature of the gc, and yet to
provide the information necessary to perform the garbage collection?

> > > > This would allow
> > > > memory reads and stores to be indirected only when the page is being
> > > > cleaned out, as opposed to a true indirecting memory set and reference
> > > > technology that would _always_ go through an extra indirection; that
> > > > would work fine enough, but memory reads would be more than twice
> > > > as slow (not only two memory references, but second one being dependent
> > > > on the first, so the pipeline would stall often). This slowdown can be
> > > > tolerated if it is only required on pages that are being worked on by
> > > > the gc, but not if it is the basic operation for every memory read and
> > > > write.
> > >
> > > Ah, I think I see. So you set a bit that says that this page is in
> > > old-space.
> >
> > Not quite. You set a bit that says that this page is one which may
> > have forwarded pointers. Where the page is is irrelevant, though
> > it is actually more likely to be in a newspace than an oldspace.
>
> Okay, right. That's basically what I meant. I was making the jump that
> this would be most useful for old pages.

I would think the opposite; the more likely place for a forwarding
pointer is to a newly moved object, which is most optimally in objects
just moved from the first to the second generation (and less and less
likely in generations after that).

-- 
Duane Rettig    duane@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   


Relevant Pages

  • Re: Best Ref-counting algorithms?
    ... You're making a semantic mistake in equating "tracing" with "tracing ... actual work of reclaiming garbage blocks is generically known as ... collectors, tracing is just the first step in performing a collection. ... Then you scan the heap looking ...
    (comp.compilers)
  • Re: Location
    ... People can tolerate piling up garbage. ... Sewer line workers are looked down upon yet are the ... On the other hand, if the garbage collectors skipped a week, we'd have ... The society which scorns excellence in plumbing because plumbing ...
    (alt.usage.english)
  • Re: Gratuities
    ... there are so many different refuse collectors ... Our garbage gets collected well before I rise in the morning. ... Early to bed and early to rise, at least before the bin man arrives, is ...
    (alt.usage.english)
  • Re: Location
    ... People can tolerate piling up garbage. ... Sewer line workers are looked down upon yet are the ... On the other hand, if the garbage collectors skipped a week, we'd have ... The society which scorns excellence in plumbing because plumbing ...
    (alt.usage.english)
  • Re: some newbie questions
    ... It is a rapid prototyping tool, intended to support development ... when the hardware itself may be in FPGA state or worse. ... In this sense you can call the FPGAs "piece of garbage" - compared ... But if the OP already got the hardware in a good shape, they have no need to use crooks. ...
    (microsoft.public.development.device.drivers)