Re: Speed: a column of floats from large file



Christophe Rhodes <csr21@xxxxxxxxx> writes:

> Thanks for bringing this (back :-) to my attention.
>
> Self Cumul Total
> Nr Count % Count % Count % Function
> ------------------------------------------------------------------------
> 1 400 27.8 460 31.9 400 27.8 SB-KERNEL:HAIRY-DATA-VECTOR-SET
> 2 289 20.1 1084 75.2 689 47.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII
> 3 241 16.7 324 22.5 930 64.5 ARRAY-DIMENSION
>
> This is telling us that 75% of the time is spent in
> FD-STREAM-READ-N-CHARACTERS/ASCII and callees (and so that you're
> likely using an ascii external format for this profile run), and that
> about half of that time (32%) is likely spent calling
> HAIRY-DATA-VECTOR-SET. HAIRY-DATA-VECTOR-SET is called for (SETF
> AREF) when the compiler didn't have any information about the
> simpleness and specialized array element type of the destination
> array, and tends to be slow because it's one huge ugly typecase (which
> SBCL currently does not optimize to a constant-time jump-table).

Thank you for explaining the details.

> <http://paste.lisp.org/display/9894> contains a candidate patch which
> might improve the situation; it would be good if interested parties
> could test it against their workloads and report any effect.

I built a new SBCL with the patch and the results are definitely
better.

Without the patch:

CL-USER[3]: (time (dotimes (i 10) (parse-file-2 "/tmp/data")))
Evaluation took:
23.542 seconds of real time
19.549028 seconds of user run time
1.352794 seconds of system run time
0 page faults and
254,086,856 bytes consed.

Self Cumul Total
Nr Count % Count % Count % Function
------------------------------------------------------------------------
1 47 24.9 60 31.7 47 24.9 (SB-C::TL-XEP SB-KERNEL:HAIRY-DATA-VECTOR-SET)
2 41 21.7 139 73.5 88 46.6 (SB-C::TL-XEP SB-IMPL::FD-STREAM-READ-N-CHARACTERS/LATIN-1)
3 40 21.2 181 95.8 128 67.7 (SB-C::HAIRY-ARG-PROCESSOR READ-LINE)
4 32 16.9 38 20.1 160 84.7 (SB-C::TL-XEP ARRAY-DIMENSION)
5 19 10.1 19 10.1 179 94.7 (SB-C::TL-XEP ARRAY-RANK)

And with the patch:

CL-USER[3]: (time (dotimes (i 10) (parse-file-2 "/tmp/data")))
Evaluation took:
7.783 seconds of real time
6.419024 seconds of user run time
1.346795 seconds of system run time
0 page faults and
254,084,160 bytes consed.

Self Cumul Total
Nr Count % Count % Count % Function
------------------------------------------------------------------------
1 29 44.6 56 86.2 29 44.6 (SB-C::HAIRY-ARG-PROCESSOR READ-LINE)
2 17 26.2 25 38.5 46 70.8 (SB-C::TL-XEP SB-IMPL::FD-STREAM-READ-N-CHARACTERS/LATIN-1)
3 8 12.3 8 12.3 54 83.1 "foreign function __read"
4 5 7.7 5 7.7 59 90.8 "no debug information for frame"
5 2 3.1 2 3.1 61 93.8 "foreign function pthread_sigmask"

PARSE-LINE-2 is defined as:

(defun parse-file-2 (file)
(let ((count 0))
(with-open-file (in file :external-format :iso-8859-1)
(do ((line (read-line in nil :eof) (read-line in nil :eof)))
((eql line :eof) t)
(incf count)))
count))

It was reading a log file with about 75k lines that were less than 80
chars in length. Is there any way to improve it further with another
2 line patch :)

Thanks,
Pete
.