Re: sort functions in python



Steven D'Aprano wrote:
On Fri, 08 Feb 2008 19:09:06 -0800, Jeff Schwab wrote:

Steven D'Aprano wrote:
On Fri, 08 Feb 2008 17:00:27 -0800, t3chn0n3rd wrote:

Do you think it is relatively easy to write sort algorithms such as
the common Bubble sort in Python as compared to other high level
programming langauges

You realise that bubble sort is one of the worst possible sort
algorithms possible, particularly on modern hardware? It's not the
worst: bogo sort and bozo sort are worse, but bubble sort is still
pretty atrocious.
That's the conventional wisdom, mostly because CS professors need a "bad
algorithm" to show undergrads, but it's not really accurate. The truth
is that bubble-sort's performance depends strongly on the input data. In
the worst case, yes, bubble-sort is O(n^2); however, for almost-sorted
data (only a few elements out of place), bubble-sort is actually one of
the fastest known algorithms.

It depends on what you mean by "bubble sort". There are many different variations of bubble sort, that are sometimes described by names such as comb sort, cocktail sort, exchange sort, and sometimes merely referred to bubble sort. It's rather like any divide-and-conquer sorting algorithm being called quicksort.

I'm as guilty of that as anyone else -- the example code I posted, raided from Wikibooks is very different from this bubble sort algorithm from PlanetMath:

http://planetmath.org/encyclopedia/Bubblesort.html

def bubblesort(L):
done = False
while not done:
done = True
for i in xrange(len(L)-1):
if L[i+1] <= L[i]:
L[i], L[i+1] = L[i+1], L[i]
done = False
return L

This one runs significantly faster than the earlier one I posted.

Another vital factor is, what language are you implementing the sort routine in? The conventional wisdom for low-level languages like assembly and C doesn't necessarily hold for high-level languages like Python. Comparisons are slow in Python and fast in C; in C a good algorithm will minimize the amount of moves, while in Python a good algorithm will minimize the number of comparisons.

Finally, what hardware are you running on? There are interactions between hardware caches and pipes that can ruin the performance of an otherwise promising algorithm.


But all those factors aside, I fear that your attempt to rehabilitate the reputation of bubble sort is misplaced. Here's another person who wants to defend bubble sort:

"A fair number of algorithm purists (which means they've probably never written software for a living) claim that the bubble sort should never be used for any reason. Realistically, there isn't a noticeable performance difference between the various sorts for 100 items or less, and the simplicity of the bubble sort makes it attractive."

http://linux.wku.edu/~lamonml/algor/sort/bubble.html

But then on the same person's page on insertion sort:

"The insertion sort is a good middle-of-the-road choice for sorting lists of a few thousand items or less. The algorithm is significantly simpler than the shell sort, with only a small trade-off in efficiency. At the same time, the insertion sort is over twice as fast as the bubble sort and almost 40% faster than the selection sort."

http://linux.wku.edu/~lamonml/algor/sort/insertion.html

He gives sample C code for both, and the bubble sort has eight lines of code, versus nine for insertion sort (excluding bare braces).

Perhaps what he should have said is:

"Bubble sort is a tiny bit simpler than insertion sort, and almost twice as slow!"

So basically, you and I agree that the right sorting algorithm for the job depends on the job. I have no particular interest in evangelizing for bubble-sort; however, I hate to see an algorithm (or data structure, or language, or programming style) taken out of the running before the race even starts, just because it's out of fashion.
.


Quantcast