Re: Set union





Jon Harrop wrote:
Gordon Sande wrote:

Brooks Moses wrote:

Jon Harrop wrote:
I suspect this is coming from some of your attempts to compare your
OCaml code to Fortran equivalents;


Yes, exactly. The original was actually written in Mathematica but the
implementations are very similar (the Mathematica is much shorter than the
OCaml but also asymptotically slower).


thus, a useful question may be "How does the OCaml compiler implement this in machine code?" The result is
likely to suggest a good way to do things in Fortran.


The OCaml equivalent is the Set module which implements an immutable
balanced binary tree. I doubt this has even been implemented correctly in
Fortran (or C) before though as it is so error prone that you really need
the static analysis provided by something like an ML compiler.

Other potential data structures are the mutable balanced binary tree (which
I used in a C++ equivalent, via the STL set) and a hash set. These
alternatives are likely to be more error prone and more difficult to
optimise but the result should be significantly faster than the OCaml.

As to how this is implemented in machine code by ocamlopt, I've no idea. For
one thing, it's garbage collected...


Me to! James's comment that knowing the operations, and their relative
usage, is very important.
...
Set intersection is trivial and set union is straight forward and acceptable if not a major operation. Anyone using
sparse matrices will recognize this as a standard sparse matrix idiom.


I agree entirely. In this case, set union is the main operation.

A further indication of the application would help so that others do not have to wade through your code. Are the sets disjoint? Are they built by accretion or direct specification? What do you do with the results? Are side conditions like sorted order an issue?

My guess is that a mutable hash table would work best in Fortran. However, I
have no idea if this is correct or feasible.

Codes that rely on system storage allocators can often be improved by use of more purpose specific allocation methods. That usually means suballocation of larger storage blocks in practice. In F77 that meant subscripts instead of machine pointers (not F90 style pointers) inside fixed arrays. The Fortran gestalt is to not rely on hidden facilities as they can often lead the unwary into inefficiency. This has the down side that the user has to implement more infrastructure. Your comments on the efficiency of Mathematica vrs OCaml are exactly why Fortran has not had such hidden facilities. F90 has provided many more standard facilities but not gone in for "exotica".

There is no reason why any explicitly stated algorithm can not be
implemented in Fortran. Just do not expect to find fashionable
"algorithm of the day" methods in the Fortran manuals. I have done
"heads down" translation of C into F90 and gotten running algorithms
that resemble no common Fortran idiom as all storage was allocated
via pointers. I viewed it as trusting my luck to the system allocators.
Interestingly the application was nearest neighbor searching and related
graph algorithms.




.



Relevant Pages

  • Re: A question on Newtons Method
    ... bug reports per day. ... there are arbitrary-precision integers and rationals (Num in OCaml, ... native format in Mathematica, and a major hassle in Fortran). ...
    (sci.math.num-analysis)
  • Re: OCAMl a more natural extension language for python?
    ... The open source g95 Fortran 95 compiler is already usable and will be ... I don't use OCAML, so I looked at some OCAML code to multiply matrices ... In Python with Numeric it's just ... be considerably more low-level than Python with Numeric/Numarray or ...
    (comp.lang.python)
  • Re: Challenge
    ... > As the programs are designed to demonstrate the capabilities of OCaml, ... > of them will not be easy to convert to Fortran and, I believe, are likely ... > to be verbose and slow. ... OCaml is obviously a "terse syntax" language that eschews type ...
    (comp.lang.fortran)
  • Re: Set union
    ... >> Jon Harrop wrote: ... OCaml but also asymptotically slower). ... >> likely to suggest a good way to do things in Fortran. ... As to how this is implemented in machine code by ocamlopt, ...
    (comp.lang.fortran)
  • Re: Matrix Multiplication
    ... to be more a setup for language wars than about Matrix Programming. ... I am using LAPACK that is written in Fortran 77 and MUMPS ... would not seem to make it as a reason for ignoring F90 in an exposition. ... contemporary style of the array notation that is available in many ...
    (sci.math.num-analysis)