writing sequences to a file

From: Bart Vandewoestyne (MyFirstName.MyLastName_at_telenet.be)
Date: Tue, 29 Mar 2005 08:31:10 +0000 (UTC)

    Hello,

    Suppose that you are working with sequences of s-dimensional
    points, N points per sequence. Both s and N can become very
    large (up to about 1000 or 10000 for s, and up to 10 million for N).
    All numbers are real values between 0 and 1. I want to write these
    points to a file using as much precision as possible.
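
    As a rough illustration of "as much precision as possible": for IEEE
    double precision, 17 significant decimal digits are generally enough to
    read back exactly the value that was written, provided the compiler's
    formatted I/O rounds correctly. The sketch below assumes double
    precision reals; the kind parameter dp, the program name and the edit
    descriptor ES25.16E3 are illustrative choices, not something from the
    original post.

```fortran
program roundtrip_precision
  implicit none
  ! Assumption: the points are stored as IEEE double precision reals.
  integer, parameter :: dp = selected_real_kind(15, 307)
  real(dp) :: x, y
  character(len=25) :: buf

  call random_number(x)              ! a value in [0, 1)

  ! ES25.16E3 prints 1 digit before and 16 digits after the decimal
  ! point (17 significant digits) with a 3-digit exponent.
  write(buf, "(ES25.16E3)") x

  ! Read the text back and compare with the original value.
  read(buf, "(ES25.16E3)") y

  print *, "text form:        ", buf
  print *, "round trip exact: ", x == y
end program roundtrip_precision
```

    If the last line reports F on some compiler, its formatted I/O does not
    round-trip at this width, and comparing files value by value would need
    a tolerance instead of strict equality.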

    The parameters s and N are only known at run-time, so most of the time
    you work with allocatable arrays for which the size depends on s and N.

    What would be the best format for writing such an N x s sequence to a file
    so that you can easily compare it with a sequence somebody else sends you
    in the same format (checking for equality; e.g. the numbers could be N
    s-dimensional points generated by a random number generator)?

    The main 'problem' is that I only know s at run-time, so I don't know
    what number to use as the repeat count in my edit descriptors.

    It would be nice if I could do something like this for an array 'sequence'
    consisting of 'row_number' rows and 's' columns:

    write(unit=unit_number, fmt="(sES20.15)") sequence(row_number, :)

    but this has two problems:

    1) I can't use the variable 's' as a repeat count in my edit descriptor.
    2) Is there a bound on the repeat count, or can I make it as large
    as I want? If s is 1000 or 10000, that would mean I am
    using *a lot* of columns in my output file. I remember having tried
    this once, and I got an error saying that I used too many columns or
    something like that; I don't remember the exact details. Is
    there actually a bound on the number of columns in an output file, or
    should it be possible to write about 10000 reals with 15 decimals on
    one line of the output file?
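
    One common way around problem 1 is to build the format string at
    run-time with an internal write. The sketch below assumes double
    precision reals and sample sizes s = 1000 and n = 5; the file name
    sequence.txt, unit 10 and the descriptor ES25.16E3 are illustrative
    assumptions. Note that a field width of 20 is too narrow for 15
    decimals in ES form (digit, point, 15 decimals and exponent already
    take 21 characters), so a wider field is used here. Regarding problem
    2, the practical limit is usually the record length of the file rather
    than the repeat count, so the sketch sets recl explicitly; default
    record lengths vary between compilers.

```fortran
program write_rows
  implicit none
  ! Assumption: double precision reals, one point per row of 'sequence'.
  integer, parameter :: dp = selected_real_kind(15, 307)
  integer, parameter :: field_width = 25   ! width used in ES25.16E3
  integer :: s, n, i, unit_number
  real(dp), allocatable :: sequence(:,:)
  character(len=32) :: fmt_string

  s = 1000                                 ! dimension, known only at run-time
  n = 5                                    ! number of points (small for the demo)
  allocate(sequence(n, s))
  call random_number(sequence)

  ! Build the format at run-time, e.g. "(1000ES25.16E3)".
  write(fmt_string, "(A, I0, A)") "(", s, "ES25.16E3)"

  unit_number = 10
  ! recl gives the maximum record (line) length in characters, so that
  ! one line can hold all s fields.
  open(unit=unit_number, file="sequence.txt", status="replace", &
       action="write", recl=s*field_width)
  do i = 1, n
    write(unit=unit_number, fmt=fmt_string) sequence(i, :)
  end do
  close(unit_number)
end program write_rows
```

    With a compiler that supports Fortran 2008, the unlimited repeat
    count, as in fmt="(*(ES25.16E3))", avoids building the string
    altogether.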

    Another option would be to use a maximum of say 10 columns and spread
    one s-dimensional point over multiple lines, and separate the points
    with an empty line. For example for 15-dimensional sequences:

    0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10
    0.11 0.12 0.13 0.14 0.15

    0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10
    0.11 0.12 0.13 0.14 0.15

    ... and so on...
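
    This layout needs no run-time repeat count at all, because of format
    reversion: when the output list has more items than the format
    provides for, Fortran starts a new record and reuses the format. A
    minimal sketch, assuming double precision reals and sample sizes
    s = 15 and n = 2; the file name sequence_wrapped.txt and unit 10 are
    illustrative assumptions.

```fortran
program write_wrapped
  implicit none
  ! Assumption: double precision reals, one point per row of 'sequence'.
  integer, parameter :: dp = selected_real_kind(15, 307)
  integer :: s, n, i, unit_number
  real(dp), allocatable :: sequence(:,:)

  s = 15
  n = 2
  allocate(sequence(n, s))
  call random_number(sequence)

  unit_number = 10
  open(unit=unit_number, file="sequence_wrapped.txt", status="replace", &
       action="write")
  do i = 1, n
    ! The format holds 10 values; for s > 10 the remaining values
    ! continue on new lines through format reversion.
    write(unit=unit_number, fmt="(10ES25.16E3)") sequence(i, :)
    ! Empty line separating this point from the next one.
    write(unit=unit_number, fmt="(A)") ""
  end do
  close(unit_number)
end program write_wrapped
```

    Each line is at most 250 characters wide here, which stays well within
    the default record length of typical compilers.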

    How would you people do this? What would be the best option to store
    the points so you can also easily compare them with the points from
    somebody else who sticks to the same output format?

    Regards,
    Bart

    -- 
    	"Share what you know.  Learn what you don't."
    
