Re: Is Greenspun enough?



George Neuner wrote:
On Wed, 07 Dec 2005 09:37:53 +0100, Ulrich Hobelmann
<u.hobelmann@xxxxxx> wrote:

George Neuner wrote:
Unused code/data doesn't waste RAM - only virtual addresses and disk
swap space.  Most OSes memory map executables directly from the file
system so code doesn't pollute the file cache or swap space.
But they cache them, and the swapping-out becomes noticeable as memory needs to be freed to store, say, heap objects.

Interesting ... I don't know many people who consider virtual memory to be a "cache" ... at least not unless the discussion involves SVM coherence.

No, virtual memory is just VM. But stuff that is actually populated needs physical memory (or disk swap) to live in, and as soon as a huge file is accessed most OSes swap everything out (or just drop the cached pages of executable files and other resources those programs might be using) to cache the stupid huge file.


Later you want to work with the program again, and every file it uses, maybe even the dynamic libraries, has to be paged in again. That's often much slower than starting a program from scratch after boot-up. I suppose file-systems have better read locality, but swap space isn't organized very well (sequentially).

Anyway, "swapping" implies data movement which is not necessarily the
case.  Paging in always involves a read from disk, but when paging out
only writable pages whose contents have changed need to be stored back
to disk.   Read-only pages, and writable data pages that have been
through at least one out-in cycle and still remain unmodified, will
simply be overwritten because their contents can be reread from the
disk.
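
A toy model of the clean/dirty distinction described above, as a sketch only. The names `Page` and `evict` are made up for illustration, and the model assumes every clean page already has a copy on disk (file or swap) to reread from; no real kernel keeps its bookkeeping this way.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a resident page frame."""
    number: int
    writable: bool
    dirty: bool = False  # set when the page is modified after being loaded


def evict(pages):
    """Evict every page; return the numbers of pages that needed a disk write.

    Clean pages (read-only, or writable but unmodified) are simply dropped,
    on the assumption that their contents can be reread from disk.
    """
    written_back = []
    for page in pages:
        if page.writable and page.dirty:
            written_back.append(page.number)  # paging out costs a disk write
        # otherwise the frame is just reused, with no I/O at all
    return written_back


resident = [
    Page(0, writable=False),               # code page: never written back
    Page(1, writable=True, dirty=False),   # unmodified data page: dropped
    Page(2, writable=True, dirty=True),    # modified data page: must be saved
]
writes = evict(resident)
```

Only page 2 ends up in `writes`; the other two frames are reclaimed without touching the disk.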

Yes, but that takes time too, and why cache a file at all that's just used once (a media stream) and then probably never accessed again?


mmap()ed files are also cached, no?

No. Mapped files are handled by the virtual memory system and all modern systems DMA pages directly to/from disk with no buffering.

Seriously? I thought every common OS would buffer/cache most pages in those files, so it doesn't need to access the disk every time something is read or written (just as with normal file reads/writes).


Obviously there is a mechanism to locate the disk block, so there is
indexing information for open mapped files and swap devices kept in
RAM, but the contents of the files are not cached.

Yeah, swap isn't. :)

If you read the documentation regarding mmap(), or the Windows
equivalent CreateFileMapping(), you will see scary warnings telling
you not to use normal I/O calls on a file while it is mapped.  The
result of doing so is unspecified and can be, um ... bad!
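
For what it's worth, here is a minimal sketch of combining the two access paths with explicit synchronization, using Python's `mmap` module on a throwaway temp file. It does not settle whether mapped pages are cached; it only shows that a change made through a shared mapping is visible to a plain `read()` once the mapping has been flushed. What happens without the flush is exactly the part the warnings are about.

```python
import mmap
import os
import tempfile

# Create a small scratch file (a temp file, removed again at the end).
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello world\n")

    # Map the whole file shared and modify it through the mapping.
    with mmap.mmap(fd, 0, access=mmap.ACCESS_WRITE) as mapping:
        mapping[0:5] = b"HELLO"
        mapping.flush()  # synchronize the mapping with the file

    # Read the same bytes back with ordinary file I/O.
    os.lseek(fd, 0, os.SEEK_SET)
    seen_by_read = os.read(fd, 5)
finally:
    os.close(fd)
    os.unlink(path)
```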

Mapping data files is straightforward, but executables have a twist.

I wager most libraries and maybe even executables on Unix systems are mmap()ed, so I really can't believe those files aren't ever cached, but reread from disk on every access...


Modern compilers/linkers produce relocatable code with metadata which
allows the OS loader to rebase the code for a different load address.
COFF, ELF and PE files contain relocation metadata as well as code.

The compiler's default base address works fine for programs ... you
only load 1 program into the process address space ... but trying to
use multiple DLLs at the same base address won't work - only the first
library loaded can use the address and all the others must be
relocated.
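
A toy sketch of what rebasing amounts to, under loud assumptions: the 8-byte "image" and its one-entry relocation list are made up, and real COFF, ELF and PE relocation records are considerably richer than a list of offsets. The function `rebase` is hypothetical, not any loader's API.

```python
import struct


def rebase(image, relocations, preferred_base, actual_base):
    """Return a copy of `image` with each relocated 32-bit address adjusted.

    `relocations` lists byte offsets in the image that hold absolute
    little-endian 32-bit addresses computed for `preferred_base`.
    """
    delta = actual_base - preferred_base
    patched = bytearray(image)
    for offset in relocations:
        (address,) = struct.unpack_from("<I", patched, offset)
        struct.pack_into("<I", patched, offset, (address + delta) & 0xFFFFFFFF)
    return bytes(patched)


# Made-up image: 4 opaque bytes of "code", then one absolute pointer to the
# start of the image as linked for the preferred base 0x10000000.
image = b"\x90\x90\x90\x90" + struct.pack("<I", 0x10000000)
loaded = rebase(image, [4], preferred_base=0x10000000, actual_base=0x20000000)
pointer = struct.unpack_from("<I", loaded, 4)[0]
```

Note that the patched copy differs from the file on disk, which is why a relocated page can no longer be shared as-is with a process that loaded the library at a different address.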

Yep.

I haven't investigated exactly *how* current OS loaders handle code
relocation.  It used to be done by making a modified copy on the swap
device and running it from there.  I suspect that now it is done on
demand by the page fault handler according to whatever process/mapping
combination currently controls the page ... that would seem to make
sense even if it slows page handling a bit because it conserves swap
space for data pages and allows different code mappings to share the
same virtual pages.

I think most functions are accessed only indirectly through a table; yes, sounds awfully slow, but I too haven't exactly understood the badly-documented machine-level stuff that happens inside a typical dynamic loader. :(


The dll itself (i.e. the code) is probably just mmap()ed and so demand-paged.

I'd like to be able to tell the OS not to cache whatever file I'm reading sequentially (or to cache just the "current" couple of 100k).

I think for that you'll have to get friendly with mmap() and/or CreateFileMapping().

No, I'm just talking about the way memory is handled on most current systems. If I had time to do my own system, it'd include a file-open flag called don't-cache-this-file, so the file would be cached with lowest priority (dropped from cache as soon as anything else needed memory). I'm not sure most current systems even have memory priorities (like they do have scheduling priorities), only some basic LRU.
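
The nearest existing knob I can point to is `posix_fadvise()`, which Python exposes as `os.posix_fadvise` on platforms that have it. It is not the open flag described above: it is a per-range hint the kernel may ignore. A minimal sketch, where `advise`, `stream_without_caching` and the chunk size are my own illustrative choices:

```python
import os
import tempfile

CHUNK = 256 * 1024  # arbitrary chunk size, standing in for "a couple of 100k"


def advise(fd, offset, length, advice_name):
    """Pass a caching hint to the kernel if the platform supports it."""
    if not hasattr(os, "posix_fadvise"):
        return
    try:
        os.posix_fadvise(fd, offset, length, getattr(os, advice_name))
    except OSError:
        pass  # hints are advisory, so a refusal is not treated as an error


def stream_without_caching(path, chunk=CHUNK):
    """Read a file once, front to back, and return the number of bytes read.

    After each chunk the kernel is told that the bytes just consumed will
    not be needed again, so it may drop them from its cache early.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        advise(fd, 0, 0, "POSIX_FADV_SEQUENTIAL")  # length 0 means whole file
        while True:
            data = os.read(fd, chunk)
            if not data:
                break
            advise(fd, total, len(data), "POSIX_FADV_DONTNEED")
            total += len(data)
    finally:
        os.close(fd)
    return total


# Usage on a scratch file of a little over three chunks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (3 * CHUNK + 123))
bytes_read = stream_without_caching(tmp.name)
os.unlink(tmp.name)
```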


Or to cache executable contents over big data file chunks, so I don't have to wait so much for applications to swap back in.

Not exactly sure what you mean here, but modern systems don't swap processes ... just pages. The code you want to execute or the data it needs might be on the _only_ page in the whole application that is not currently in RAM [unlikely but possible].

When applications are swapped in, it usually takes a LONG time, so I actually close them whenever I don't have any state open in them. OK, now with 1GB I'm not that picky anymore, but I still notice that when music or video files fill up my cache, there's a small slowdown when restarting apps, because their files aren't in the cache anymore.


Some older paging Un*x systems tracked the process working set and
tried to keep it together.  When the process was scheduled, the system
checked the current working set and brought in any missing pages
before restarting the process.  Perhaps this is what you meant.

No, systems do that (with a global working-set AFAIK), but they allow some random big file to take up all cache memory and have application files (and executables) either swapped out or just dropped from physical memory (for read-only files). That's because they only have basic LRU from the '70s, with no priorities attached to memory blocks at all.


WS swapping was popular for a while, but it was abandoned by the time
Unix System III arrived.  It was found to cause thrashing in busy
systems and proved to be difficult to tune effectively for different
hardware configurations.  If RAM is overcommitted, trying to bring in
all the WS pages for one process just pushes out needed WS pages for
other processes.  Great disk workout ensues.

Yes, that's the problem with local instead of global policies. Since it's the OS's memory, the OS can best manage it itself.


Ultimately OS developers settled on the current system - single page
demand replacement - as the best compromise solution. It made sense
to abandon WS swapping when memory was tight. Now with huge memories
becoming the norm it might make sense to revisit it as a performance
boost.

No, I only want priorities for memory blocks. I'd leave the paging inside the kernel, but users could tell it what priorities they have (like: I won't ever need this mp3 file again, but please cache my images and executable file with priority "normal"). Something like BeOS did with threads, so user-perceived performance (or latency) improves.
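
To make the idea concrete, here is a toy simulation of LRU with priorities attached to blocks. Everything in it is hypothetical (the class `PriorityLRU`, the two priority levels, the block names); it sketches the policy and is not a description of any existing kernel.

```python
from collections import OrderedDict

LOW, NORMAL = 0, 1  # hypothetical priority levels


class PriorityLRU:
    """Toy block cache: evicts the least recently used block of the lowest priority."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # key -> priority, oldest access first

    def access(self, key, priority=NORMAL):
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            return
        if len(self.blocks) >= self.capacity:
            self._evict()
        self.blocks[key] = priority

    def _evict(self):
        lowest = min(self.blocks.values())
        # OrderedDict iterates oldest first, so this picks the LRU block
        # among those sharing the lowest priority.
        victim = next(k for k, p in self.blocks.items() if p == lowest)
        del self.blocks[victim]


cache = PriorityLRU(capacity=4)
for page in ("exe:0", "exe:1", "lib:0"):
    cache.access(page, NORMAL)
for n in range(100):  # stream a big file once, at low priority
    cache.access(f"mp3:{n}", LOW)
survivors = sorted(cache.blocks)
```

With plain LRU the hundred mp3 blocks would push out the executable and library pages; here only the single low-priority slot keeps getting recycled, and `survivors` still contains `exe:0`, `exe:1` and `lib:0`.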


--
Majority, n.: That quality that distinguishes a crime from a law.


