Re: how to deal with various representations of matrices ?



On Jun 23, 7:24 pm, p...@xxxxxxxxxxxxxxxxx (Pascal J. Bourguignon)
wrote:
Thibault Langlois <thibault.langl...@xxxxxxxxx> writes:
Hi, I often have to manipulate numerical data (in programs in in the
REPL) that happens to be in various formats (2D arrays, lists of
vectors, vector of vectors, list of lists  etc...). Like many I wrote
a bunch of trivial utilities to:
- compute some stats
- find max and min in columns
- extract parts of the matrix etc...

My problem is as data may be in various formats I have to take this
into account.

Watch Lecture-4b from:http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/



 I see 3 ways:

1. write functions to convert data from each format to a canonical
representation (2D arrays). And code functions only for this
representation. This has the advantage of not duplicating code but the
translation process may be consuming time and space if I have a large
amount of data.

2. use method to specialize on each data format. In order to do this I
have to create a class "list-of-vectors" (for example) in order to be
able to specialize on it. This not very convenient especially when
working with the REPL because I have to use make-instance very often.

3. use a method that specialize on a symbol ex:
(defmethod mean-columns ((data-format (eql :list-of-vectors))
data) ...

This may be a very newbie dilemma but, anyway, is there more clever
ways to deal with this issue ?

Another way to deal with that would be to use an adaptor pattern.  And
since you seem to say that you have more than three representations,
indeed it would be nice to choose one intermediary representation, so
you have to write only 2N adaptors instead of N²-N.

Now, the nice thing is that once you've wrote the 2N adaptors, you can
compose then and compile the N²-N optimized direct adaptors.

But another consideration is that data structures should match the
algorithms (and vice-versa).  If you have an algorithm that needs
direct access to the slots of your sequences, it would be much better
to use a vector than an adaptor over a list.  This would mean
converting the data into the data structure adapted to each algorithm.
If you hide well your representations as indicated in the Lecture-4b,
you could convert from one representation to the other automatically.

(let ((seq (make-instance 'smart-sequence)))
  (insert item seq :at-end)
  (insert item seq :at-end)
  (insert item seq :at-end) ; oh oh! we're just appending items to the end
                            ; perhaps I should switch to a reverse list representation
  (insert item seq :at-end) ; Yay! O(1)
  (insert item seq :at-end)
  ... (assume 10000 more)
  ;; assume a call to sort
  (ref item 5000)              
  (ref item 2500)           ; oh oh! looks like random access
  (ref item 7500)           ; let's switch to a vector representation
  (ref item 1250)           ; Yay! O(1)
  (ref item 3750)
  ... (assume 10000 more)
  )


Thank you for the link and the tips.
I watched the lecture. Very interesting. For me, many of the concepts
that are taught are build-in with CLOS. Anyway the lecture was very
eloquent. It seems clear that, as Alberto suggests, that the full CLOS-
based approach is the best. I suppose the adapters you talk about
should be implemented as change-class methods. However the switch
between representations may be space and time consuming.

(defclass list-of-vectors ()
((data :accessor data :initarg :data)))

(defclass 2d-array ()
((data :accessor data :initarg :data)))


(defmethod sort/column ((a list-of-vectors) col pred)
(setf (data a) (sort (data a) pred
:key #'(lambda (v) (svref v col)))))

(defmethod sort/column ((a 2d-array) col pred)
;; let's be lazy
(setq a (change-class a 'list-of-vectors))
(sort/column a col pred))

(defmethod update-instance-for-different-class :after ((previous 2d-
array) (current list-of-vectors)
&key &allow-
other-keys)
(setf (data current)
(loop
for i below (array-dimension (data previous) 0)
collect (coerce (loop
for j below (array-dimension (data
previous) 1)
collect (aref (data previous) i j))
'vector))))

(defun test ()
(let ((m1 (make-instance 'list-of-vectors :data (list (vector 1 2 3)
(vector 4 5 6))))
(m2 (make-instance '2d-array
:data (make-array '(2 3)
:initial-contents '((1 2
3) (4 5 6))))))
(format t "m1: ~A ~A~%" m1 (data m1))
(format t "m2: ~A ~A~%" m2 (data m2))
(sort/column m1 1 #'>)
(sort/column m2 1 #'>)
(format t "m1: ~A ~A~%" m1 (data m1))
(format t "m2: ~A ~A~%" m2 (data m2))))

CL-USER> (test)
m1: #<LIST-OF-VECTORS {C0A4D31}> (#(1 2 3) #(4 5 6))
m2: #<2D-ARRAY {C0A4DB9}> #2A((1 2 3) (4 5 6))
m1: #<LIST-OF-VECTORS {C0A4D31}> (#(4 5 6) #(1 2 3))
m2: #<LIST-OF-VECTORS {C0A4DB9}> (#(4 5 6) #(1 2 3))




--
__Pascal Bourguignon__                    http://www.informatimago.com/

"You cannot really appreciate Dilbert unless you read it in the
original Klingon"

.



Relevant Pages

  • Re: Complex Subset Computing in Prolog
    ...    variables is unknown build something like distinct. ... There is often confusion when lists and sets are used (in Prolog or ... but let's do this for Prolog) and this pops up ... list representation that is not unique. ...
    (comp.lang.prolog)
  • Re: Complex Subset Computing in Prolog
    ...    variables is unknown build something like distinct. ... There is often confusion when lists and sets are used (in Prolog or ... but let's do this for Prolog) and this pops up ... list representation that is not unique. ...
    (comp.lang.prolog)
  • Re: Complex Subset Computing in Prolog
    ...    variables is unknown build something like distinct. ... There is often confusion when lists and sets are used (in Prolog or ... but let's do this for Prolog) and this pops up ... list representation that is not unique. ...
    (comp.lang.prolog)
  • Re: how to deal with various representations of matrices ?
    ... vectors, vector of vectors, list of lists etc...). ... My problem is as data may be in various formats I have to take this ... representation. ... Now, the nice thing is that once you've wrote the 2N adaptors, you can ...
    (comp.lang.lisp)
  • Re: Modeling Address using Relational Theory
    ... >> address (not a representation of one). ... lists are typically modeled as lists ... One operation on the data is an operation to order addr2 after ... be able to adjust much more handily than if I put addr2 before addr1 on ...
    (comp.databases.theory)

Loading