Re: moving some native-formatted fortran data from linux/x86 to aix/pSeries



Dan Stromberg wrote:

I'm working with a researcher on getting his data converted from native
linux/x86 format, to something xlf90 would be happy with on AIX.  I'm not
sure just which fortran compiler he's using on the linux side, but then
I'm not sure if it matters - it could be gcc's fortran or fortran 95, or
it could be something commercial like portland group or so.

Do you mean Fortran UNFORMATTED files?

The data files certainly contain real*4's, and may contain other things as
well.

I wrote a quick little C program that assumed everything in the file was a
real*4 on a linux system (all linuxes concerned are Fedora Core 3, so far
at least), and converted that to ASCII. Then I wrote another little C
program that would read a bunch of real*4's as ASCII, and convert that to
AIX real*4.  However, the researcher is telling me that this produced
incorrect results when he used the resulting data file in his application,
so I'm looking more deeply now.

Do all the numbers look like the expected REAL*4 numbers?
Fortran UNFORMATTED files need at least enough extra information to identify record boundaries. A single READ statement reads the record written by a single WRITE statement, and it may read less than what was written, but not more.


You might be reading the record marks as REAL*4. Interpreted that way, a record mark would most likely show up as a very small number.
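
As a rough illustration of that last point, here is a small Python sketch. It assumes the record mark is a 4-byte integer holding the record length, which is a common convention but depends on the compiler, and it borrows the 44-byte record size mentioned later in the post. The variable names are only for illustration.

```python
import struct

# Assumption: the record mark is a 4-byte integer holding the record length.
# 44 is the record size the source code seemed to use.
marker_bytes = struct.pack("<i", 44)   # little-endian, as on x86

# Reinterpret the same four bytes as an IEEE single-precision value (REAL*4).
as_real4 = struct.unpack("<f", marker_bytes)[0]

print(marker_bytes.hex())   # 2c000000
print(as_real4)             # a tiny denormal, far below any plausible data value
```

A value like that turning up at regular intervals in the ASCII dump would be a strong hint that record marks are being converted along with the data.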

Anyway, my questions are:

1) Are there any compiler options that might assist us in this transition?

2) Is there a way of making one or more of the fortran compilers use
unbuffered I/O, so I can use strace and/or truss to see how large
chunks the programs are reading and writing data in?  I'm guessing that
this might also help us identify if the data files have a single,
consistent record length, or if the data is variable length records.

3) Am I correct in assuming that a C program should be able to use
real*4's in the same format that fortran will on these platforms?

If you skip the record marks, which most likely hold the record length, the rest will most likely be formatted appropriately, unless the byte order (endianness) differs. x86 is little endian. I am not sure about AIX on PowerPC, and you don't say which processor.
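
If the byte order does turn out to differ, the conversion is just a reversal of each 4-byte word. The Python sketch below assumes the target is big endian and that the data consists only of 4-byte items; `swap_real4` is a hypothetical helper name, not something from either compiler.

```python
import struct

def swap_real4(data: bytes) -> bytes:
    """Reverse the byte order of each 4-byte word (little endian to big endian)."""
    if len(data) % 4 != 0:
        raise ValueError("length is not a multiple of 4 bytes")
    count = len(data) // 4
    # Handle the words as unsigned integers so every bit pattern,
    # including NaNs, passes through unchanged.
    words = struct.unpack(f"<{count}I", data)
    return struct.pack(f">{count}I", *words)

little = struct.pack("<3f", 1.0, -2.5, 3.25)   # as written on x86
big = swap_real4(little)                       # assumed layout for the AIX side
print(struct.unpack(">3f", big))
```

If the record marks are also 4-byte integers, the same swap would cover them too. It would not be correct for a file that mixes in 8-byte or character data, which is why the record layout needs to be known first.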


Yes, we do have source code.  Yes, we did study it a bit, and it -seemed-
like it was writing consistent, 44 byte fixed-width records, but then the
file length isn't evenly divisible by 44, so....

The easiest approach is to write a small Fortran program that writes out one record, then look at the resulting file with od -x, hexdump, or some similar program.
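
Once the layout of one record is known, a short script can check the whole file. The Python sketch below assumes each record is framed by a 4-byte length before and after the payload. That framing is an assumption to confirm from the dump, and `read_records` is a hypothetical helper. The demo data is synthetic, not taken from the real files.

```python
import struct

def read_records(data: bytes, byteorder: str = "<"):
    """Split a file image into record payloads.

    Assumed framing: 4-byte length, payload, 4-byte length (repeated).
    """
    records = []
    pos = 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError(f"truncated leading mark at offset {pos}")
        (length,) = struct.unpack_from(byteorder + "i", data, pos)
        start = pos + 4
        end = start + length
        if length < 0 or end + 4 > len(data):
            raise ValueError(f"implausible record length {length} at offset {pos}")
        (trailer,) = struct.unpack_from(byteorder + "i", data, end)
        if trailer != length:
            raise ValueError(f"mark mismatch at offset {pos}: {length} vs {trailer}")
        records.append(data[start:end])
        pos = end + 4
    return records

# Synthetic image: two records of eleven REAL*4 values (44 bytes each).
payload = struct.pack("<11f", *[float(i) for i in range(11)])
mark = struct.pack("<i", len(payload))
image = (mark + payload + mark) * 2

for rec in read_records(image):
    print(len(rec), struct.unpack("<11f", rec)[:3])
```

Under this framing each 44-byte record occupies 52 bytes on disk, so the file length would be a multiple of 52 rather than 44. If the real file fails the mark check, the framing assumption is wrong or the records are not all the same kind.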

BTW, this research group is moving most of their binary data to NetCDF for
the long term, so they do know how to avoid this kind of issue in the
future.

-- glen
