Meaning of in-place; was fast stable sort



On 20 Nov 2008 01:25:19 -0800, Tim Rentsch
<txr@xxxxxxxxxxxxxxxxxxx> wrote:

cri@xxxxxxxx (Richard Harter) writes:

On 19 Nov 2008 08:16:33 -0800, Tim Rentsch
<txr@xxxxxxxxxxxxxxxxxxx> wrote:

cri@xxxxxxxx (Richard Harter) writes:

On 18 Nov 2008 07:08:47 -0800, Tim Rentsch
<txr@xxxxxxxxxxxxxxxxxxx> wrote:

cri@xxxxxxxx (Richard Harter) writes:


That said I will argue that the
recursion space doesn't count when one is talking about "in
place". It seems to me that the essence of "in place" is that
the data does not move out of collective space that it originally
occupied.

[snip]
[snip not particularly relevant prolog]


So my position, which is a minority position, is that the "formal
definition" is a poor choice of things to include and ignore. My
view is that the "in-place" should refer only to the data element
space, i.e, no more than a fixed number of additional data
elements are used. Recursion should be an additional qualifier,
e.g., quicksort is a recursive in-place algorithm.

A problem is that many algorithms that are not in-place can be turned
into in-place algorithms with only a small amount of additional
storage -- N bits, sqrt(N) bits, sometime even just log N bits (and I
mean single bits here!) If only data elements count, then lots of
these distinctions disappear. All sorting algorithms, for example,
would be in-place, since we could simply make an array of indices and
manipulate the indices rather than the elements (and having indices
would also allow them all to be stable). The costs of other kinds of
variables (bits, integers, pointers, indices) really should be
non-zero, even if they aren't all measured in bits, nor necessarily
dependent on N.

The distinctions don't disappear; they are just recategorized.
Thus your sorting algorithm that uses indices would be in-place
but use O(n) additional space. For that matter it wouldn't be
the same algorithm as the one that doesn't use indices. In fact,
unless there is a step to permute the data, it doesn't even do
the same thing.

I suggest that the formal definition ignores an important
distinction, that between data element space and space used by
the algorithm itself. The distinction is significant because
data element size is an independent variable; it can be
indefinitely large, regardless of the number of data elements.

By the space used by the algorithm, I mean all space used by the
algorithm other than data element space, e.g., stack space,
indices, etc. The important thing here is that for purposes of
analysis this space usage is independent of the data element
size.

The issue then is that we have two kinds of space usage and the
formal definition subsumes them into one by making an implicit
assumption that the data element size is O(1). I opine that this
is an important distinction, one that should be observed in the
nomenclature. I submit that reserving the term "in-place" for
the data element space usage is how the word is used in practice
and is how it should be used.

To take sorting algorithms as an example, saying that quicksort
is in-place and mergesort is not captures a significant fact
about the two algorithms; whereas saying that neither is in-place
glosses over that significant fact.

Trying to factor out the cost of recursion is problematic because
recursion could be used to hide the cost of other items. Quicksort
isn't just recursive, it's recursive /with paramaters/ (as I suppose
all useful recursive functions must be). If recursion with parameters
doesn't count, then a standard merge sort could be done by first
recursing to a depth of N, with a linked list going through the stack
frames, and sorting that; indices in the stacked list elements could
be used to do the final permutation on the argument array.

I disagree that it is problematic. The algorithm you give is
in-place with O(n) additional (algorithm) space. As a side note
distinguishing between algorithm variants is desirable.
Mergesort, for example, is really a family of algorithms with a
surprising variety of properties.




Of course, in practice, terms like "in-place" and the O()
notation are informally used in quite surprising (and
distressing) ways.

They are, and that's part of why I'm presenting the point of view that
I am. We don't have to be dogmatic and insist that "in-place" has
only one meaning; but if we require that the various different
meanings are each based on a well-defined computational model then at
least the question of what the model allows could be answered
unambiguously. Then we can get on to the important arguments
about which model is better. (1/2 :)

I agree. However the choice of terminology is important; it
reflects the choices we make about what is noticed and what is
ignored.


Richard Harter, cri@xxxxxxxx
http://home.tiac.net/~cri, http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.
.



Relevant Pages

  • Re: Meaning of in-place; was fast stable sort
    ... view is that the "in-place" should refer only to the data element ... Recursion should be an additional qualifier, ... The distinctions don't disappear; they are just recategorized. ... Thus your sorting algorithm that uses indices would be in-place ...
    (comp.programming)
  • Re: Meaning of in-place; was fast stable sort
    ... that between data element space and space used by ... The distinction is significant because ... will use by implementing an algorithm. ...
    (comp.programming)
  • Re: coerce for arbitrary types
    ... the algorithm itself is essentially last-in first-out, ... via some sort of recursion? ... WaitStack <= new stack containing topnode ... As nested lists, there's only one number in a proper position. ...
    (comp.lang.lisp)
  • Re: coerce for arbitrary types
    ... algorithm that Cormen, Leiserson, and Rivest use in their DFS. ... inherently first-in first-out, like a queue, not at all stack-like ... experience, once you get over the hurdle of recursion, ... *intention* is nested lists, *not* arbitrary CONS trees. ...
    (comp.lang.lisp)
  • Re: Software bugs arent inevitable
    ... I suggested that for a given algorithm implemented ... iteration should always be faster by a small factor. ... > recursive function can run out of stack space while the heap still has ... Which is why, in the part you snipped, I said that recursion (unlike ...
    (comp.lang.python)