Re: Quicksort




<robertwessel2@xxxxxxxxx> wrote in message news:68b3fb1f-3b96-4f63-97d3-640a56bbd50c@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
On Jan 8, 8:12 pm, "cr88192" <cr88...@xxxxxxxxxxx> wrote:
oh, this shows, something...
...
...

how about we stick to actual code (exact lang doesn't matter so much
probably...).
then we can actually debate about what will actually happen, rather than
pcode written in forms that will lead to little more than conjecture and
idle speculation wrt the underlying implementation...

abstact for abstract, concrete for concrete.
not so sensible to use the abstract and then debate about points in the
concrete.


Of course, the particular point being addressed was about the
algorithm, not a particular implementation, but whatever...


yes, ok.



now, IMO, the points of debate are this:
quicksort, in a purely recursive form, likes to overflow if nothing is done;
quicksort, in some edge cases, gets bad performance;
quicksort, in the average case, is one of the fastest sort algos.

highest priority here: overflows.
not good to have the app crash, after all.

so, my approach: on overflow fall back to a different sort (typically a
simple but slow sort).


(...options snipped...)

While a fallback to a O(n**2) sort is not an unreasonable way to avoid
the excessive stack growth, just fixing the implementation of
quicksort to avoid the excessive stack growth is almost certainly
simpler - that's one of my major points.


yes, but, if the "fix" is simply gaining an O(n^2) worst case (rather than an O(n^2) fallback), it is not much of a tradeoff.

also, I will argue, at least for a selection sort fallback, that the added complexity is marginal, even vs rewriting the thing into a loop.

justification:
selsort just isn't that complicated.

of course, cocktail sort is more complicated, and experimentally (for randomly generated sets) is comming out generally slower than selection sort.

Also, I'd argue that an unexpected 50000x increase in runtime may well
be equivalent to having the application crash (the approximate factor
if a quicksort of a million elements goes quadratic).


yes, however, large sorts like this are rare IME.
I use lots of small sorts (maybe hundreds or a few k items), but large sorts are, over all, fairly rare...

I use sorts much as I use hashes. they are very common, for lots of little things, but typically small volume, and not very reusable (often, they are implemented and specialized for each and every use case).


<snip>


so, if a 15x slowdown exists, the O(n^2) algo may well be faster.

much smaller, and it may well be faster to just use something like sel-sort.


While that's theoretically true, it's also fairly far divorced from
reality. Typical implementations of heapsort have an average runtime
of about double that of quicksort, and in the case of heapsort, that's
also approximately the maximum run time.


yes.

I was actually debating more about providing a generic library-style wrapper, of the kind found in qsort.

for example, the compiler is fairly smart, and it can do useful things for moving integers around.
memcpy is not as smart.

test shows, however, that qsort is only about 1.6x slower than a plain natively-compiled sort (with compiler optimizations on), and is actually faster than a native quicksort with optimizations turned off...

so: I think the C runtime is compiled with optimizations turned on...

then again, it is possible that either qsort is making use of a special case for 4-byte items as well (this would make sense, since these are statistically likely to be very common, ints, pointers, and pointers to things...).


with compiler optimizations, qsort is still about 1.3x faster than heapsort.
as such, qsort is probably implemented as a quicksort variant, or something else similarly fast.


In any event, for 1000 elements, this is all a pretty much pointless
debate. If you only got a few thousand elements to sort, use
Shellsort - it's simpler, and at approximate O(n**1.5) worst case is
almost always good enough for such small data sets. You might even
want to consider that for the quicksort fallback.


possibly.

then again, lots of small sorts add up (much like linear searches).

for example, if one is sorting many thousands of sets of items, each with a few hundred items, speed still matters.


likewise, even though most lists are smallish, not doing any hashing can still kill an apps' performance, as all these masses of linked-list lookups accumulate and choke the app...

actually, this happened unexpectedly long ago, at a time where I was unaware of the dangers of masses of linear lookups...

sadly, my current new upper compiler (this one being built on DOM trees, unlike the previous which was built on Lisp-style lists) faces pretty much the same problem (though, since it is not fully written yet, I have little idea what to expect for compile times).


But your point that performance needs to be "good enough" is correct -
which is why heapsort is very attractive. It's only marginally slower
than quick sort, and will never bite you in the posterior by going
quadratic. IOW, if it's good enough, it will alwyas be good enough,
no surprised - the same cannot be said of quicksort, even though it
may be a bit better on average.


yeah, I guess so. I guess one tends not to worry too much about the worst case, given how rare it is.

I guess, in cases where it becomming unstable and going "quadratic" is a serious operational risk, introspective sort is probably a good idea.

after all, if one has both quicksort and heapsort at hand, beating together introsort is pretty easy.


then again, introsort is pretty bulky, so, likely, it would make sense to implement all this behind a vaguely-generic qsort-style interface, and use the amazing powers of copy and paste.


or, I guess, one can just use qsort, and assume that the runtime implementors have done a good enough job.


.


Quantcast