Re: disk file reads slow down for file sizes greater than 2 GB

From: Richard Maine (nospam_at_see.signature)
Date: 03/18/04


Date: 18 Mar 2004 08:42:12 -0800

shaeffer@bigzoo.net (John S) writes:

> Thus this appears to be a data stride/OS cache/disk cache issue and
> not one of addressing limits.

That's my guess, though I wasn't going to take the time to experiment.
It would take a fair amount of work to experiment comprehensively (and
ideally on several different machines), and I don't even have any good
candidates to play with conveniently at work. (For some things,
the virtual PC on my Mac or Linux boxes is ok, but I think not
for this - it would just muddy the water.)

Anyway, to add one constructive input other than just giving
excuses, be sure to also consider the interaction with virtual
memory swapping. That could make it hard to match your results
with trivial benchmark programs, which might not have the same
kinds of access patterns to program memory, even if they had the
same access patterns to the data files.

I'm thinking that you are probably taking a big hit from having
virtual memory paging and heavy data file I/O both at the same
time. Your figures of a factor of 30 or so (if I recall them
correctly) sound quite plausible for that kind of situation.
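
For anyone who does want to experiment, here is a minimal sketch of the kind of benchmark I have in mind: one that mimics the program's memory access as well as its file access. It is in Python purely for brevity, and every name in it (timed_strided_read, record_size, stride, working_set, page) is made up for illustration; none of it comes from the original program. It reads every stride-th fixed-size record of a file and, optionally, touches one byte per page of an in-memory buffer after each read.

```python
import os
import time


def timed_strided_read(path, record_size, stride, working_set=None, page=4096):
    """Read every `stride`-th record of `path` and time it.

    If `working_set` (a non-empty bytearray) is given, one byte of it is
    written after each read, advancing one page at a time and wrapping
    around, to imitate a program that walks a large array while it reads.
    The 4096-byte page size is an assumption, not a measured value.

    Returns (records_read, elapsed_seconds).
    """
    n_records = os.path.getsize(path) // record_size
    count = 0
    touch = 0
    start = time.perf_counter()
    # buffering=0 removes Python-level buffering only; the OS file cache
    # is still in play, which is part of what is being measured.
    with open(path, "rb", buffering=0) as f:
        for rec in range(0, n_records, stride):
            f.seek(rec * record_size)
            data = f.read(record_size)
            if len(data) < record_size:
                break  # short read: stop rather than count a partial record
            count += 1
            if working_set is not None:
                working_set[touch] = data[0]
                touch = (touch + page) % len(working_set)
    return count, time.perf_counter() - start
```

Running it once with working_set=None and once with a bytearray comparable in size to physical memory, on the same file and stride, would presumably separate the file-cache effect from the paging effect. I have not run such a comparison, so I make no claim about what numbers come out.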

-- 
Richard Maine                       |  Good judgment comes from experience;
email: my first.last at org.domain  |  experience comes from bad judgment.
org: nasa, domain: gov              |        -- Mark Twain

