Re: Optimizing sort-algorithm
- From: Wade Humeniuk <whumeniu+anti+spam@xxxxxxxxx>
- Date: Tue, 27 Feb 2007 14:06:54 GMT
alex wrote:
hi,
i'm trying to optimize execution time of bubblesort. it already runs
~10sec faster than the vanilla approach. But it still is ~12times
slower than the c-version. Is there nothing more to get out of?
For Lispworks, LWW 4.4
(defun bubblesort (x size)
(declare (type (simple-array fixnum (*)) x)
(type fixnum size)
(optimize (speed 3) (safety 0) (space 0) (debug 0)
#+lispworks(fixnum-safety 0)))
(dotimes (i size)
(dotimes (j (- size i))
(when (> (aref x j) (aref x (+ j 1)))
(rotatef (aref x (+ j 1)) (aref x j))))))
(defun main ()
(let* ((size 10000)
(x (make-array size :element-type 'fixnum)))
(dotimes (i size)
(setf (aref x i) (random 100)))
(time (bubblesort x size))))
Disassemble is your friend,
(Just a note I do not think it was proper to initialize the vector
with a fill pointer. simple-array is not allowed to have a fill-pointer)
From the hyper-spec,
http://www.lispworks.com/documentation/HyperSpec/Body/t_smp_ar.htm#simple-array
Type SIMPLE-ARRAY
Supertypes:
simple-array, array, t
Description:
The type of an array that is not displaced to another array, has no fill pointer, and is not expressly adjustable is a subtype of type simple-array. The concept of a simple array exists to allow the implementation to use a specialized representation and to allow the user to declare that certain values will always be simple arrays.
CL-USER 7 > (disassemble 'bubblesort)
206B8C22:
0: 55 push ebp
1: 89E5 move ebp, esp
3: 83EC0C sub esp, C
6: C70424450C0000 move [esp], C45
13: 50 push eax
14: 50 push eax
15: 50 push eax
16: 50 push eax
17: 50 push eax
18: 50 push eax
19: 50 push eax
20: 8B7D08 move edi, [ebp+8]
23: C745E000000000 move [ebp-20], 0
30: 837DDC00 cmp [ebp-24], 0
34: 0F8E00010000 jle L5
40: C745E400000000 move [ebp-1C], 0
47: 8B75DC move esi, [ebp-24]
50: 81EE00010000 sub esi, 100
56: 8975D8 move [ebp-28], esi
L1: 59: 8B5DE0 move ebx, [ebp-20]
62: 8B75DC move esi, [ebp-24]
65: 2BF3 sub esi, ebx
67: 8975E8 move [ebp-18], esi
70: 33DB xor ebx, ebx
72: 837DE800 cmp [ebp-18], 0
76: 0F8EB4000000 jle L4
82: 33D2 xor edx, edx
84: 8B75E8 move esi, [ebp-18]
87: 81EE00010000 sub esi, 100
93: 8975EC move [ebp-14], esi
L2: 96: 8D4F10 lea ecx, [edi+10]
99: 89D8 move eax, ebx
101: C1F806 sar eax, 6
104: 8B0C08 move ecx, [eax+ecx]
107: C1E108 shl ecx, 8
110: 8D8300010000 lea eax, [ebx+100]
116: 8D7710 lea esi, [edi+10]
119: 8975FC move [ebp-4], esi
122: C1F806 sar eax, 6
125: 8B75FC move esi, [ebp-4]
128: 8B0430 move eax, [eax+esi]
131: C1E008 shl eax, 8
134: 3BC8 cmp ecx, eax
136: 7E66 jle L3
138: 8D8B00010000 lea ecx, [ebx+100]
144: 8D4710 lea eax, [edi+10]
147: 894DFC move [ebp-4], ecx
150: C17DFC06 sar [ebp-4], 6
154: 8B75FC move esi, [ebp-4]
157: 8B0430 move eax, [eax+esi]
160: 8945F0 move [ebp-10], eax
163: C165F008 shl [ebp-10], 8
167: 8D4710 lea eax, [edi+10]
170: 895DFC move [ebp-4], ebx
173: C17DFC06 sar [ebp-4], 6
177: 8B75FC move esi, [ebp-4]
180: 8B0430 move eax, [eax+esi]
183: C1E008 shl eax, 8
186: 8D7710 lea esi, [edi+10]
189: 8975FC move [ebp-4], esi
192: 894DF8 move [ebp-8], ecx
195: C17DF806 sar [ebp-8], 6
199: C1F808 sar eax, 8
202: 8B75FC move esi, [ebp-4]
205: 0375F8 add esi, [ebp-8]
208: 8906 move [esi], eax
210: 8D7710 lea esi, [edi+10]
213: 8975F8 move [ebp-8], esi
216: 89D8 move eax, ebx
218: C1F806 sar eax, 6
221: 8B75F0 move esi, [ebp-10]
224: 8975FC move [ebp-4], esi
227: C17DFC08 sar [ebp-4], 8
231: 8B4DFC move ecx, [ebp-4]
234: 8B75F8 move esi, [ebp-8]
237: 890C30 move [eax+esi], ecx
L3: 240: 3B55EC cmp edx, [ebp-14]
243: 7C3D jl L6
245: 81C300010000 add ebx, 100
251: 8B4DE8 move ecx, [ebp-18]
254: 3BD9 cmp ebx, ecx
256: 0F8C5AFFFFFF jl L2
L4: 262: 8B5DE4 move ebx, [ebp-1C]
265: 3B5DD8 cmp ebx, [ebp-28]
268: 7C31 jl L7
270: 8B5DE0 move ebx, [ebp-20]
273: 81C300010000 add ebx, 100
279: 895DE0 move [ebp-20], ebx
282: 8B55E0 move edx, [ebp-20]
285: 8B5DDC move ebx, [ebp-24]
288: 3BD3 cmp edx, ebx
290: 0F8C13FFFFFF jl L1
L5: 296: FD std
297: B810000000 move eax, 10
302: C9 leave
303: C20400 ret 4
L6: 306: 81C200010000 add edx, 100
312: 89D3 move ebx, edx
314: E921FFFFFF jmp L2
L7: 319: 8B5DE4 move ebx, [ebp-1C]
322: 81C300010000 add ebx, 100
328: 895DE4 move [ebp-1C], ebx
331: 895DE0 move [ebp-20], ebx
334: E9E8FEFFFF jmp L1
339: 90 nop
340: 90 nop
341: 90 nop
NIL
CL-USER 8 > (main)
Timing the evaluation of (BUBBLESORT X SIZE)
user time = 1.542
system time = 0.000
Elapsed time = 0:00:01
Allocation = 2520 bytes standard / 3146 bytes conses
0 Page faults
NIL
CL-USER 9 >
.
- Follow-Ups:
- Re: Optimizing sort-algorithm
- From: alex
- Re: Optimizing sort-algorithm
- References:
- Optimizing sort-algorithm
- From: alex
- Optimizing sort-algorithm
- Prev by Date: Re: Lisp Syntax - functions versus data
- Next by Date: Re: Lisp Syntax - functions versus data
- Previous by thread: Re: Optimizing sort-algorithm
- Next by thread: Re: Optimizing sort-algorithm
- Index(es):
Relevant Pages
|
|