Re: comparing sorts
- From: Franken Sense <frank@xxxxxxxxxxxxxxx>
- Date: Tue, 7 Apr 2009 13:19:16 -0700
In Dread Ink, the Grave Hand of Ron Shepard Did Inscribe:
In article <egfiwr24skb5.hgg4phq38mqa$.dlg@xxxxxxxxxx>,
Franken Sense <frank@xxxxxxxxxxxxxxx> wrote:
I don't think I've got my counters in the right place:
I think the bubble/insertion sort counters are right.
subroutine selection_sort(a)[...]
maxi = maxloc(a(1:last))
count4 = count4 + 1
This one isn't. It should be
count4 = count4 + (last-1)
The maxloc() function must do (last-1) comparisons in order to determine
its result. It might do them is some odd order, because of loop
unrolling or cache optimization or something, but in the end that is the
number of comparisons that must be done. If that operation had been
written out as a loop, instead of using the intrinsic function, the
operation count would have been clear.
Thanks, Ron, I can see that now. I think we need to add one though:
count2 = 0
count4 = 0
do last = size(a), 2, -1
maxi = maxloc(a(1:last))
count4 = count4 + last
if ( maxi(1) .ne. last ) then
count2 = count2 + 1
temp = a(maxi(1))
a(maxi(1)) = a(last)
a(last) = temp
endif
enddo
There's also the comparison in the following if statement. This makes the
number of comparisons for this set of data a scoch more for the sel sort.
(output below.)
number of comparisons in bubble is 499500
number of swaps in bubble sort is 499500
number of compares in sel sort is 500499
number of swaps in selection sort is 500
sort size is 1000
3082500 3082484 16
real seconds for bubble is 1.60000008E-02
3082500 3082500 0
real seconds for selection sort is 0.0000000
ratio is +Infinity
number of comparisons in bubble is 49960075
number of swaps in bubble sort is 49951100
number of compares in sel sort is 50004999
number of swaps in selection sort is 9487
sort size is 10000
3084078 3082500 1578
real seconds for bubble is 1.5780001
3084078 3084359 281
real seconds for selection sort is 0.28100002
ratio is 5.6156583
number of comparisons in bubble is 666222553
number of swaps in bubble sort is 666122682
number of compares in sel sort is 705082703
number of swaps in selection sort is 99935
sort size is 100000
3240437 3084375 156062
real seconds for bubble is 156.06201
3240437 3266296 25859
real seconds for selection sort is 25.859001
ratio is 6.0351138
If the time corresponded to these counters, you would think that the
selection sort were 4 orders of magnitude faster instead of about six times
faster for descending data.
Try also starting with vectors that are almost in the right order. For
example:
do i=1,r
myarray(i)= (i)
end do
! myarray(1) = (r); myarray(r) = 1
This puts the input array in sorted order. If you uncomment that last
line, then all but two elements are in their final locations. For these
kinds of arrays, bubble/insertion sort will be much faster than
selection sort (by a factor of O(N)).
So far, I've done three cases: random, ascending, and descending. I didn't
post anything about the ascending case, as I felt a little silly with all
zeroes as output.
A person would want to have a grab bag of distributions if he were testing
rigorously. I'll work up a few more, including one that is standard
normal, and some linear combinations of them.
One other comment. You mentioned in your previous post a "flat pdf". A
"pdf" is a probability distribution function, right? The timings for
these kinds of sorts do not depend on the distribution of values in the
array. The distribution of values is important for some sorts, but not
for these sorts. The ONLY thing that matters in these algorithms is the
initial order of the elements. The values themselves do not matter
beyond determining the order. The values might have a flat pdf, a
skewed pdf, or a pdf with multiple peaks, it doesn't matter. This is
not true for all sort algorithms, some of them actually do depend on the
distributions of the values of the array elements.
One thing about this distribution is if you have the elements in their
original places, one has a fifty percent chance of being greater than the
next. In ascending, it has zero percent chance. In descending, it has a
hundred percent chance. These distinctions matter to the efficacy of the
sort.
The performance of algorithms is often characterized by how they do in
their best case, their worst case, and their typical case. The best
case and worst case for the two algorithms we are discussing are when
the input arrays are in their correct order and in their reverse order,
respectively. This is good to know because in some applications you
have input arrays that are like those (or almost so), and you might pick
a wrong algorithm if you ignore that knowledge of your data. The
"typical" case is a little harder to understand. It is usually defined,
at least in principle, for these sorts as the arithmetic mean over all
possible input array orders. For an array of length N, there are N!
possible orders. So even for small values of N, it is not possible to
actually measure times for this large a number of sorting operations.
Some of those N! orders are going to be like the best-case orders, some
will be like the worst-case orders, and the rest will be somewhere in
between, so the timings from which the mean value is defined might span
a large range. Instead of computing all possible N! times, the best
that can be done empirically is to compute some statistical sample of
such array orders, time those, and estimate the behavior of the
algorithm from that sample. Theoretical analysis can be used to
estimate some of that information, and actual timings contribute the
rest of the information.
Also, when you compare timings, look at the ratios of times as a
function of N. If you increase N by a factor of 10, and the timings
increase by a factor of 100, then you know that you have an O(N^2)
algorithm. If the times increase by a factor of 10, then you know that
you have an O(N) algorithm. For the above arrays, you should see both
types of timings for these sorts. If the timings increase by a factor
that looks like (10+X), where X is small and possibly decreases a little
as N increases, then you probably have an O(N*LogN) algorithm.
$.02 -Ron Shepard
I added comparisons of times between orders of magnitude:
program sortdriver
implicit integer(a-h)
integer power1, power2, r
integer ival(8), time1(8), time2(8)
real, allocatable :: myarray(:), myarray2(:)
real u, float1, float2, ratio, ratio2, ratio3
power1=3
power2=5
! main control
! populate arrays in descending order
do j = power1, power2
r = 10**j
allocate(myarray(r))
allocate(myarray2(r))
do i=1,r
u = r - real(i)/real(r)
myarray(i)=u
end do
myarray2 = myarray
! print *, "myarray is ", myarray, myarray2
! time the sorts
call date_and_time(values=ival)
g = 60 * 1000 * ival(6) + 1000* ival(7) + ival(8)
call bubble_sort(myarray)
call date_and_time(values=ival)
h = 60 * 1000 * ival(6) + 1000* ival(7) + ival(8)
call selection_sort(myarray2)
call date_and_time(values=ival)
d = 60 * 1000 * ival(6) + 1000* ival(7) + ival(8)
! output
! print *, "sorted array is ", myarray
! print *, "second sorted array is ", myarray2
print *, "sort size is ", r
diff = h - g
time1(j)=diff
float1 = diff * .001
print *, h,g,diff
print *, "real seconds for bubble is ", float1
c = d-h
time2(j)=c
float2 = c * .001
print *, h, d ,c
print *, "real seconds for selection sort is ", float2
ratio=float1/float2
print *, "ratio is ", ratio
print *, " "
deallocate(myarray)
deallocate(myarray2)
end do !end main control
!compare sorts to themselves with successive orders of magnitude
do i = power2, power1+1, -1
ratio2 = real(time1(i))/real(time1(i-1))
print *, "power and ratio for bubble is", i, ratio2
end do
do i = power2, power1+1, -1
ratio3 = real(time2(i))/real(time2(i-1))
print *, "power and ratio for sel sort is", i, ratio3
end do
contains
subroutine bubble_sort(a)
! simple bubble sort.
implicit none
real, intent(inout) :: a(:)
real :: temp
integer :: start, bubble, count1, count3
count1 = 0
count3 = 0
do start = (size(a)-1), 1, -1
do bubble = start, (size(a)-1)
count3 = count3+1
if ( a(bubble) .le. a(bubble+1) ) exit
count1 = count1 + 1
temp = a(bubble+1)
a(bubble+1) = a(bubble)
a(bubble) = temp
enddo
enddo
print *, "number of comparisons in bubble is ", count3
print *, "number of swaps in bubble sort is ", count1
return
end subroutine bubble_sort
subroutine selection_sort(a)
! simple selection sort.
implicit none
real, intent(inout) :: a(:)
real :: temp
integer :: last, maxi(1), count2, count4
count2 = 0
count4 = 0
do last = size(a), 2, -1
maxi = maxloc(a(1:last))
count4 = count4 + last
if ( maxi(1) .ne. last ) then
count2 = count2 + 1
temp = a(maxi(1))
a(maxi(1)) = a(last)
a(last) = temp
endif
enddo
print *, "number of compares in sel sort is ", count4
print *, "number of swaps in selection sort is ", count2
return
end subroutine selection_sort
end program sortdriver
! gfortran sort11.f90 -Wall -o des.exe
! abridged output:
E:\gfortran\dan>gfortran sort11.f90 -Wall -o des.exe
E:\gfortran\dan>des
power and ratio for bubble is 5 99.912292
power and ratio for bubble is 5 104.13333
power and ratio for sel sort is 5 87.121216
power and ratio for sel sort is 5 18.562500
E:\gfortran\dan>des
power and ratio for bubble is 5 99.848366
power and ratio for bubble is 5 104.20000
power and ratio for sel sort is 5 92.024910
power and ratio for sel sort is 5 17.562500
[corrected power]
E:\gfortran\dan>gfortran sort11.f90 -Wall -o des.exe
E:\gfortran\dan>des
power and ratio for bubble is 5 98.898605
power and ratio for bubble is 4 98.625000
power and ratio for sel sort is 5 92.024910
power and ratio for sel sort is 4 +Infinity
E:\gfortran\dan>
The selection sort number for size 1000 is too small for me to measure
effectively, but all the other numbers look pretty close to a hundred.
--
Frank
Most of us here in the media are what I call infotainers...Rush Limbaugh is
what I call a disinfotainer. He entertains by spreading disinformation.
~~ Al Franken
.
- Follow-Ups:
- Re: comparing sorts
- From: Ron Shepard
- Re: comparing sorts
- References:
- comparing sorts
- From: Franken Sense
- Re: comparing sorts
- From: JB
- Re: comparing sorts
- From: Franken Sense
- Re: comparing sorts
- From: Ron Shepard
- Re: comparing sorts
- From: Franken Sense
- Re: comparing sorts
- From: Ron Shepard
- Re: comparing sorts
- From: Franken Sense
- Re: comparing sorts
- From: Ron Shepard
- comparing sorts
- Prev by Date: Re: comparing sorts
- Next by Date: Real Time Monitoring / Alarming
- Previous by thread: Re: comparing sorts
- Next by thread: Re: comparing sorts
- Index(es):
Relevant Pages
|