Re: Optimizing sort-algorithm



alex wrote:
hi,
i'm trying to optimize execution time of bubblesort. it already runs
~10sec faster than the vanilla approach. But it still is ~12times
slower than the c-version. Is there nothing more to get out of?


For Lispworks, LWW 4.4

(defun bubblesort (x size)
(declare (type (simple-array fixnum (*)) x)
(type fixnum size)
(optimize (speed 3) (safety 0) (space 0) (debug 0)
#+lispworks(fixnum-safety 0)))
(dotimes (i size)
(dotimes (j (- size i))
(when (> (aref x j) (aref x (+ j 1)))
(rotatef (aref x (+ j 1)) (aref x j))))))

(defun main ()
(let* ((size 10000)
(x (make-array size :element-type 'fixnum)))
(dotimes (i size)
(setf (aref x i) (random 100)))
(time (bubblesort x size))))

Disassemble is your friend,

(Just a note I do not think it was proper to initialize the vector
with a fill pointer. simple-array is not allowed to have a fill-pointer)

From the hyper-spec,

http://www.lispworks.com/documentation/HyperSpec/Body/t_smp_ar.htm#simple-array

Type SIMPLE-ARRAY

Supertypes:

simple-array, array, t

Description:

The type of an array that is not displaced to another array, has no fill pointer, and is not expressly adjustable is a subtype of type simple-array. The concept of a simple array exists to allow the implementation to use a specialized representation and to allow the user to declare that certain values will always be simple arrays.



CL-USER 7 > (disassemble 'bubblesort)
206B8C22:
0: 55 push ebp
1: 89E5 move ebp, esp
3: 83EC0C sub esp, C
6: C70424450C0000 move [esp], C45
13: 50 push eax
14: 50 push eax
15: 50 push eax
16: 50 push eax
17: 50 push eax
18: 50 push eax
19: 50 push eax
20: 8B7D08 move edi, [ebp+8]
23: C745E000000000 move [ebp-20], 0
30: 837DDC00 cmp [ebp-24], 0
34: 0F8E00010000 jle L5
40: C745E400000000 move [ebp-1C], 0
47: 8B75DC move esi, [ebp-24]
50: 81EE00010000 sub esi, 100
56: 8975D8 move [ebp-28], esi
L1: 59: 8B5DE0 move ebx, [ebp-20]
62: 8B75DC move esi, [ebp-24]
65: 2BF3 sub esi, ebx
67: 8975E8 move [ebp-18], esi
70: 33DB xor ebx, ebx
72: 837DE800 cmp [ebp-18], 0
76: 0F8EB4000000 jle L4
82: 33D2 xor edx, edx
84: 8B75E8 move esi, [ebp-18]
87: 81EE00010000 sub esi, 100
93: 8975EC move [ebp-14], esi
L2: 96: 8D4F10 lea ecx, [edi+10]
99: 89D8 move eax, ebx
101: C1F806 sar eax, 6
104: 8B0C08 move ecx, [eax+ecx]
107: C1E108 shl ecx, 8
110: 8D8300010000 lea eax, [ebx+100]
116: 8D7710 lea esi, [edi+10]
119: 8975FC move [ebp-4], esi
122: C1F806 sar eax, 6
125: 8B75FC move esi, [ebp-4]
128: 8B0430 move eax, [eax+esi]
131: C1E008 shl eax, 8
134: 3BC8 cmp ecx, eax
136: 7E66 jle L3
138: 8D8B00010000 lea ecx, [ebx+100]
144: 8D4710 lea eax, [edi+10]
147: 894DFC move [ebp-4], ecx
150: C17DFC06 sar [ebp-4], 6
154: 8B75FC move esi, [ebp-4]
157: 8B0430 move eax, [eax+esi]
160: 8945F0 move [ebp-10], eax
163: C165F008 shl [ebp-10], 8
167: 8D4710 lea eax, [edi+10]
170: 895DFC move [ebp-4], ebx
173: C17DFC06 sar [ebp-4], 6
177: 8B75FC move esi, [ebp-4]
180: 8B0430 move eax, [eax+esi]
183: C1E008 shl eax, 8
186: 8D7710 lea esi, [edi+10]
189: 8975FC move [ebp-4], esi
192: 894DF8 move [ebp-8], ecx
195: C17DF806 sar [ebp-8], 6
199: C1F808 sar eax, 8
202: 8B75FC move esi, [ebp-4]
205: 0375F8 add esi, [ebp-8]
208: 8906 move [esi], eax
210: 8D7710 lea esi, [edi+10]
213: 8975F8 move [ebp-8], esi
216: 89D8 move eax, ebx
218: C1F806 sar eax, 6
221: 8B75F0 move esi, [ebp-10]
224: 8975FC move [ebp-4], esi
227: C17DFC08 sar [ebp-4], 8
231: 8B4DFC move ecx, [ebp-4]
234: 8B75F8 move esi, [ebp-8]
237: 890C30 move [eax+esi], ecx
L3: 240: 3B55EC cmp edx, [ebp-14]
243: 7C3D jl L6
245: 81C300010000 add ebx, 100
251: 8B4DE8 move ecx, [ebp-18]
254: 3BD9 cmp ebx, ecx
256: 0F8C5AFFFFFF jl L2
L4: 262: 8B5DE4 move ebx, [ebp-1C]
265: 3B5DD8 cmp ebx, [ebp-28]
268: 7C31 jl L7
270: 8B5DE0 move ebx, [ebp-20]
273: 81C300010000 add ebx, 100
279: 895DE0 move [ebp-20], ebx
282: 8B55E0 move edx, [ebp-20]
285: 8B5DDC move ebx, [ebp-24]
288: 3BD3 cmp edx, ebx
290: 0F8C13FFFFFF jl L1
L5: 296: FD std
297: B810000000 move eax, 10
302: C9 leave
303: C20400 ret 4
L6: 306: 81C200010000 add edx, 100
312: 89D3 move ebx, edx
314: E921FFFFFF jmp L2
L7: 319: 8B5DE4 move ebx, [ebp-1C]
322: 81C300010000 add ebx, 100
328: 895DE4 move [ebp-1C], ebx
331: 895DE0 move [ebp-20], ebx
334: E9E8FEFFFF jmp L1
339: 90 nop
340: 90 nop
341: 90 nop
NIL

CL-USER 8 > (main)
Timing the evaluation of (BUBBLESORT X SIZE)

user time = 1.542
system time = 0.000
Elapsed time = 0:00:01
Allocation = 2520 bytes standard / 3146 bytes conses
0 Page faults
NIL

CL-USER 9 >
.



Relevant Pages

  • Re: Sorting
    ... Then go back to the start of the array and do it again, until you make a clean sweep without swapping. ... This algorithm is called bubblesort. ... However I am not saying that you are wrong that, overall, selection sort might not be a better introductory algorithm than bubblesort. ...
    (comp.lang.c)
  • Re: Template problems.
    ... ISortHandle, so it knows about the array and its size. ... ISortHandle to the BubbleSort, so the BubbleSort also implicitly knows ... overwrite what was in the ISortHandle and then sort the whole thing. ...
    (alt.comp.lang.learn.c-cpp)
  • Re: Help me with references
    ... >> efficiently than you can with bubble sort. ... > Bubblesort accomplishes this faster than insertion sort. ... your "almost sorted array" with one comparison. ...
    (comp.lang.java.help)
  • Re: Help me with references
    ... >> situation where the desired output is not actually a sorted array, ... > I suspect that even an "almost sorted array" could be produced more ... Bubblesort accomplishes this faster than insertion sort. ...
    (comp.lang.java.help)