Re: How to improve my summarizing code



Geoffrey King <lordergeoffrey@xxxxxxxxxxxxxxxx> writes:

I am trying to 'correct' my c++/java training and get the lisp
feeling. Below is my third attempt at the problem. I am looking for
pointers on improving it, or even better changing my approach. Any
comments are appreciated.

The problem is summarizing or rolling up numeric values found in a in
a list of lists. An example of the structure would be:
((a b 1) (a b 2) (a c 3) (a c 7))
Yes it is a simplified database table style structure and does have
uniform 'columns'. The result of summarizing should produce a result
of:
((a b 3) (a c 10))

One other criteria is that result should conform to this general
structure, otherwise a hash may tempt me.

If you don't need the result sorted, then using a hash table you can
reach a complexity of O(n) while sort is O(n*log(n)). And when you
use APPEND in a loop, you usually get O(n²).


(defun key (row) (first row))
(defun value (row) (second row))
(defun make-row (key value) (list key value))

(defun summarize (rows)
(loop with h = (make-hash-table :test (function equal)) ; for complex keys
for row in rows
do (incf (gethash (key row) h 0) (value row))
finally (let ((result '()))
(maphash (lambda (k v) (push (make-row k v) result)) h)
(return result))))

(summarize '((c 7) (a 1) (a 3) (b 1) (b 10) (b 100)))
--> ((C 7) (A 4) (B 111))


(defun key (row) (subseq row 0 2))
(defun value (row) (elt row 2))
(defun make-row (key value) (append key (list value)))
(summarize '((a b 1) (a b 2) (a c 3) (a c 7)))
--> ((A B 3) (A C 10))


You probably need a more sophisticated data structure for rows,
involving a vector of fields and some meta-data.

Also, you can easily keep more data than the sum in the hash table:
....
(incf (cdr (gethash (key row) h (cons (data row) 0))) (value row))
....
(push (make-row k (car v) (cdr v)) result)
....
Or you can accumulate several columns at the same time, etc...

--
__Pascal Bourguignon__ http://www.informatimago.com/

"Specifications are for the weak and timid!"
.



Relevant Pages

  • Re: How to count value in a ArrayList
    ... were adding was not already stored in the data structure. ... Doing this with a ArrayList is a lot slower than a hash ... But presumably they use a relatively common technique of having a fixed-sized array of bins that grows as the hash table accumulates items. ... Some implementations use bins that are either arrays or linked lists, while others simply use the next available spot in the hash table. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: beginning with ML
    ... What's the preferred data structure? ... Lists are commonly used, being functional, but hash tables are also ... I use the RedBlackTree implemenation in the utility library ...
    (comp.lang.functional)
  • Re: Tic-Tac-Toe help
    ... details of the actual data structure used doesn't import much indeed. ... It could be anything including list of lists, arrays, property lists, ... (defun coordinates-from-square-name (sqname) ... Then you can implement the grid abstraction, ...
    (comp.lang.lisp)
  • Re: Which data structure to use
    ... When I say lists, ... I implemented a Queue data structure as that adds and accesses ... List probably in combination with array or hash down below at point 2. ...
    (comp.graphics.algorithms)
  • Re: Tic-Tac-Toe help
    ... details of the actual data structure used doesn't import much indeed. ... It could be anything including list of lists, arrays, property lists, ... (defun coordinates-from-square-name (sqname) ... Then you can implement the grid abstraction, ...
    (comp.lang.lisp)