Re: writing output to textfile



"(-Peter-)" <garfieldpbj@xxxxxxxxx> wrote in message
news:60fdebea-f0b8-4ae7-b9fa-a81e52e4705e@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hi..

I've been using Java to print out 10 million lines like "int
double" (where int and double are numbers, for example 1 and
1,932131293)

I printed it to a .txt file, and the result was a 300 MB file - isn't
this "too much"?

I'm using the output in Octave (="free Matlab"), where I load the file
as a matrix - this takes a very long time, and uses all my memory when I
manipulate the data (I have 2 GB, which I think is "enough")

Is it normal that 10 million lines would use that much space?

Is there another way to save the output which does not require that
amount of space (and memory usage in Octave)?

You can obviously save space by writing binary - investigate
DataOutputStream on the Java end. When reading the file into Octave you'll
be using fread(); the most efficient way to use this, if you have different
data types being written out, is perhaps to write out column by column, that
is, N ints, then N doubles and so forth. Then you can use fread to read N
ints or N doubles into column or row vectors in one fell swoop, and
concatenate as necessary.
To be honest I don't exactly see how ints and doubles are mingling in a
matrix anyway, so they are probably logically separate.
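
A rough sketch of the column-by-column idea on the Java side, in case it
helps. The class name ColumnWriter, the method writeColumns and the file
name columns.bin are made up for illustration, and the sample data is
arbitrary; only DataOutputStream and its writeInt/writeDouble calls come
from the standard library.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical helper: writes N ints followed by N doubles as raw binary.
public class ColumnWriter {

    // Writes the int column first, then the double column.
    // DataOutputStream emits 4-byte ints and 8-byte doubles, big-endian.
    static void writeColumns(String path, int[] ids, double[] values)
            throws IOException {
        if (ids.length != values.length) {
            throw new IllegalArgumentException("columns must have equal length");
        }
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(path)))) {
            for (int id : ids) {
                out.writeInt(id);
            }
            for (double v : values) {
                out.writeDouble(v);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Small made-up data set; the real case would have 10 million rows.
        int n = 1000;
        int[] ids = new int[n];
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            ids[i] = i + 1;
            values[i] = i * 1.932131293;
        }
        writeColumns("columns.bin", ids, values);
    }
}
```

The resulting file is exactly 12 bytes per record. Since DataOutputStream
writes big-endian, the fread calls on the Octave side would most likely
need the 'ieee-be' machine format together with 'int32' and 'double'
precisions, reading N values each time.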

Bear in mind that 10 million records composed of one int (4 bytes) and one
double (8 bytes) are going to be 120 million bytes anyhow, which is pretty
big for processing even with 2 GB of RAM on your machine. My gut feeling is
that you may want to look carefully at exactly what it is you're doing with
the data, and see if the processing actually requires that all of the data
be available at once.

AHS

