Re: massive data analysis with lisp
- From: fofiko@xxxxxxxxxxxxxxx
- Date: 14 Oct 2006 01:04:34 -0700
Yes, an excellent additional trick!
To complete your thought, all one needs to do is:
(file-position data-stream indexed-position)
(read data-stream)
and you've got the data nearly instantly! In this manner you can get
hold of as many or as few of the entries as you like, with fully
random access.
Thanks for the extension!
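For anyone who wants the two calls above as a single function, here is
a minimal sketch. The name read-entry-at and its arguments are my own
illustration, not part of the code in this thread, and binding
*read-eval* to NIL is an added precaution rather than something the
original calls do.

```lisp
;; Hypothetical helper (name and arguments are illustrative only):
;; fetch one entry given the file position stored in an index.
(defun read-entry-at (path offset)
  "Open PATH, seek to OFFSET, and return the single Lisp form stored there."
  (with-open-file (data-stream path)
    (file-position data-stream offset)
    ;; With *READ-EVAL* bound to NIL, #. forms in the data file signal
    ;; an error instead of evaluating code.
    (let ((*read-eval* nil))
      (read data-stream))))
```

Reopening the file on every call keeps the sketch simple; for many
lookups in a row one would probably keep a single stream open and just
repeat the file-position / read pair.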
To finish up on the subject, here is the code for a second index
(customer->ratings).
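(defun read-mov-idx ()
  ;; Read the file positions listed in mov_index.txt into a hash table
  ;; keyed by K, the 1-based number of the entry in file order.
  (with-open-file (i "mov_index.txt")
    (let ((htable (make-hash-table)))
      (loop for k from 1
            for res = (read i nil)
            while res
            do (setf (gethash k htable) res))
      htable)))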
(defun make-cust-idx (movidx)
  ;; ASSOCTAB maps a customer id to a growable vector of movie keys.
  (let ((assoctab (make-hash-table)))
    ;; Pass 1: visit every movie entry and record its key under each
    ;; customer that appears in it.
    (with-open-file (ifile "data.lisp")
      (loop for k being the hash-keys in movidx using (hash-value v) do
        (file-position ifile v)
        ;; The CDR of an entry is a list whose elements each start
        ;; with a customer id.
        (let ((res (cdr (read ifile))))
          (loop for elem in res do
            (let ((custid (first elem)))
              (unless (gethash custid assoctab)
                (setf (gethash custid assoctab)
                      (make-array 1 :element-type 'fixnum
                                    :fill-pointer 0 :adjustable t)))
              (vector-push-extend k (gethash custid assoctab)))))))
    ;; Pass 2: write each vector to cust.lisp and the position where it
    ;; starts to cust_index.txt.
    (with-open-file (ofidx "cust_index.txt" :direction :output
                                            :if-exists :supersede)
      (with-open-file (ofcust "cust.lisp" :direction :output
                                          :if-exists :supersede)
        (loop for k being the hash-keys in assoctab using (hash-value v) do
          (format ofidx "~A ~A~%" k (file-position ofcust))
          (format ofcust "~S~%" v))))))
It requires about 600 MB of memory to build the index, and once it's
done cust.lisp takes up 550 MB and cust_index.txt about 8.5 MB.
So with both indices in place and resident in memory, total memory
requirements are about 8.6 MB for a very fast way to get to your data
with minimal time wasted, no SQL, and fully integrated with Lisp ;-)
I'm sure someone could optimize this further if needed to reduce the
memory used at build time.
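For completeness, reading the customer index back could look roughly
like the sketch below. Both function names are hypothetical (nothing in
this thread defines them). The sketch assumes the two files were
written by make-cust-idx as shown, i.e. each line of cust_index.txt is
"customer-id position" and cust.lisp holds one printed vector per
line. Binding *read-eval* to NIL is an added precaution.

```lisp
;; Hypothetical reader for the index written by make-cust-idx.
(defun read-cust-idx (&optional (path "cust_index.txt"))
  "Load \"customer-id position\" pairs from PATH into a hash table."
  (with-open-file (in path)
    (let ((htable (make-hash-table))
          (*read-eval* nil))
      ;; Each iteration reads the customer id, then the file position.
      (loop for custid = (read in nil)
            while custid
            do (setf (gethash custid htable) (read in)))
      htable)))

;; Hypothetical lookup: one seek and one READ per customer.
(defun customer-movies (custid custidx &optional (path "cust.lisp"))
  "Return the vector of movie keys stored for CUSTID, or NIL if unknown."
  (let ((offset (gethash custid custidx)))
    (when offset
      (with-open-file (in path)
        (file-position in offset)
        (let ((*read-eval* nil))
          (read in))))))
```

Usage would be along the lines of (customer-movies some-id
(read-cust-idx)), where some-id is a customer id present in the data;
each returned key can then be looked up in the table from read-mov-idx
to find that movie's entry in data.lisp.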