Re: Is Greenspun enough?
- From: Ulrich Hobelmann <u.hobelmann@xxxxxx>
- Date: Thu, 08 Dec 2005 10:40:28 +0100
George Neuner wrote:
On Wed, 07 Dec 2005 09:37:53 +0100, Ulrich Hobelmann <u.hobelmann@xxxxxx> wrote:
George Neuner wrote:
Unused code/data doesn't waste RAM - only virtual addresses and disk swap space. Most OSes memory map executables directly from the file system so code doesn't pollute the file cache or swap space.
But they cache them, and the swapping-out becomes noticeable as memory needs to be freed to store, say, heap objects.
Interesting ... I don't know many people who consider virtual memory to be a "cache" ... at least not unless the discussion involves SVM coherence.
No, virtual memory is just VM. But stuff that is actually populated needs physical memory (or disk swap) to live in, and as soon as a file is accessed most OSes swap everything out (or just clear the caches for executable files and other resources those programs might be using) to cache the stupid huge file.
Later you want to work with the program again, and every file it uses, maybe even the dynamic libraries, has to be paged in again. That's often much slower than starting a program from scratch after boot-up. I suppose file-systems have better read locality, but swap space isn't organized very well (sequentially).
Anyway, "swapping" implies data movement which is not necessarily the case. Paging in always involves a read from disk, but when paging out only writable pages whose contents have changed need to be stored back to disk. Read-only pages, and writable data pages that have been through at least one out-in cycle and still remain unmodified, will simply be overwritten because their contents can be reread from the disk.
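The page-out rule described above can be illustrated with a toy model. This is only a sketch: the `Page` class and `page_out` function are hypothetical names made up for illustration, not how any real kernel is written. It shows that only writable pages whose contents have changed cost a disk write, while read-only and unmodified pages are simply dropped.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    writable: bool
    dirty: bool = False  # True once the page was modified after its last page-in


def page_out(pages):
    """Evict every page; return (written_back, dropped) lists of page numbers."""
    written_back, dropped = [], []
    for page in pages:
        if page.writable and page.dirty:
            # Contents differ from what is on disk, so they must be stored.
            written_back.append(page.number)
            page.dirty = False
        else:
            # Contents can be reread from disk later; the frame is just reused.
            dropped.append(page.number)
    return written_back, dropped


pages = [
    Page(0, writable=False),              # read-only code page
    Page(1, writable=True, dirty=True),   # modified data page
    Page(2, writable=True, dirty=False),  # data page unchanged since page-in
]
written_back, dropped = page_out(pages)
print(written_back, dropped)
```

In this model only page 1 needs a write; pages 0 and 2 cost nothing on the way out, though they still cost a read when they are needed again.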
Yes, but that takes time too, and why cache a file at all when it's just used once (a media stream) and then probably never accessed again?
mmap()ed files are also cached, no?
No. Mapped files are handled by the virtual memory system and all modern systems DMA pages directly to/from disk with no buffering.
Seriously? I thought every common OS would buffer/cache most pages in those files, so it doesn't need to access the disk every time something is read or written (just as with normal file reads/writes).
Obviously there is a mechanism to locate the disk block, so there is indexing information for open mapped files and swap devices kept in RAM, but the contents of the files are not cached.
Yeah, swap isn't. :)
If you read the documentation regarding mmap(), or the Windows equivalent CreateFileMapping(), you will see scary warnings telling you not to use normal I/O calls on a file while it is mapped. The result of doing so is unspecified and can be, um ... bad!
Mapping data files is straightforward, but executables have a twist.
I wager most libraries and maybe even executables on Unix systems are mmap()ed, so I really can't believe those files aren't ever cached, but reread from disk on every access...
Modern compilers/linkers produce relocatable code with metadata which allows the OS loader to rebase the code for a different load address. COFF, ELF and PE files contain relocation metadata as well as code.
The compiler's default base address works fine for programs ... you only load 1 program into the process address space ... but trying to use multiple DLLs at the same base address won't work - only the first library loaded can use the address and all the others must be relocated.
Yep.
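The base-address collision described above can be sketched as a small placement routine. This is a toy model under stated assumptions: `assign_bases`, the library names, the sizes and the shared preferred base are all made up for illustration, and a real loader would also have to patch the code using the relocation metadata, which is not modelled here.

```python
def assign_bases(libraries, page_size=0x1000):
    """Place libraries in load order.

    libraries: list of (name, preferred_base, size).
    Returns {name: (actual_base, was_relocated)}.
    """
    placed = []   # (start, end) address ranges already occupied
    result = {}
    for name, preferred, size in libraries:
        base = preferred
        relocated = False
        moved = True
        while moved:  # repeat until the candidate range overlaps nothing
            moved = False
            for start, end in placed:
                if base < end and start < base + size:
                    # Collision: move just past the occupied range,
                    # rounded up to a page boundary.
                    base = -(-end // page_size) * page_size
                    relocated = True
                    moved = True
        placed.append((base, base + size))
        result[name] = (base, relocated)
    return result


# Three hypothetical DLLs that were all linked for the same base address.
libs = [
    ("a.dll", 0x10000000, 0x3000),
    ("b.dll", 0x10000000, 0x2000),
    ("c.dll", 0x10000000, 0x1000),
]
layout = assign_bases(libs)
for name, (base, relocated) in layout.items():
    print(name, hex(base), "relocated" if relocated else "at preferred base")
```

Only the first library keeps its preferred address; the other two end up elsewhere and would need their address references fixed up.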
I haven't investigated exactly *how* current OS loaders handle code relocation. It used to be done by making a modified copy on the swap device and running it from there. I suspect that now it is done on demand by the page fault handler according to whatever process/mapping combination currently controls the page ... that would seem to make sense even if it slows page handling a bit because it conserves swap space for data pages and allows different code mappings to share the same virtual pages.
I think most functions are accessed only indirectly through a table; yes, sounds awfully slow, but I too haven't exactly understood the badly-documented machine-level stuff that happens inside a typical dynamic loader. :(
The dll itself (i.e. the code) is probably just mmap()ed and so demand-paged.
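The "indirectly through a table" idea can be sketched like this. It is a toy model only: `LazyTable` is a hypothetical class, and Python's `math` module stands in for a shared library. It mimics the general shape of lazy binding (resolve a symbol on first call, then reuse the table slot), not the actual machine-level mechanism of any particular dynamic loader.

```python
import math


class LazyTable:
    """Call functions through a table whose slots are filled on first use."""

    def __init__(self, resolver):
        self.resolver = resolver  # maps a symbol name to a callable
        self.slots = {}           # symbol name -> resolved callable
        self.lookups = 0          # how often the slow resolver had to run

    def call(self, name, *args):
        if name not in self.slots:
            # First call: do the expensive symbol lookup once.
            self.lookups += 1
            self.slots[name] = self.resolver(name)
        # Every later call is just one extra indirection through the table.
        return self.slots[name](*args)


table = LazyTable(lambda name: getattr(math, name))
first = table.call("sqrt", 9.0)
second = table.call("sqrt", 16.0)
print(first, second, table.lookups)
```

After the first call the cost per call is a single table lookup, which is why the indirection is usually less slow than it sounds.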
I'd like to be able to tell the OS not to cache whatever file I'm reading sequentially (or just the "current" couple of 100k).
I think for that you'll have to get friendly with mmap() and/or CreateFileMapping().
No, I'm just talking about the way memory is handled on most current systems. If I had time to do my own system, it'd include a file-open flag called don't-cache-this-file, so the file would be cached with lowest priority (dropped from cache as soon as anything else needed memory). I'm not sure most current systems even have memory priorities (like they do have scheduling priorities), only some basic LRU.
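The closest existing thing to such a flag that I know of is the POSIX `posix_fadvise()` call, which Python exposes as `os.posix_fadvise`. The sketch below reads a file sequentially and hints after each chunk that it won't be needed again. Note the assumptions: `read_without_caching` is a made-up helper name, the call is only available on some Unix systems (hence the `hasattr` check), and it is purely advisory, so the kernel is free to ignore it. It is not the open flag described above.

```python
import os
import tempfile


def read_without_caching(path, chunk_size=256 * 1024):
    """Read a file sequentially, hinting that each chunk will not be reused."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        can_advise = hasattr(os, "posix_fadvise")  # absent on some platforms
        if can_advise:
            # Length 0 means "to the end of the file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        offset = 0
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
            if can_advise:
                # Hint only: the cached pages for this range may be dropped.
                os.posix_fadvise(fd, offset, len(chunk), os.POSIX_FADV_DONTNEED)
            offset += len(chunk)
    finally:
        os.close(fd)
    return total


# Demo on a throwaway 1 MiB file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (1024 * 1024))
    demo_path = tmp.name
bytes_read = read_without_caching(demo_path)
os.remove(demo_path)
print(bytes_read)
```

Whether this actually keeps a big media file from pushing other things out of the cache depends on the system; the code only expresses the intent.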
Or to cache executable contents over big data file chunks, so I don't have to wait so much for applications to swap back in.
Not exactly sure what you mean here, but modern systems don't swap processes ... just pages. The code you want to execute or the data it needs might be on the _only_ page in the whole application that is not currently in RAM [unlikely but possible].
When applications are swapped in, it usually takes a LONG time, so I actually close them whenever I don't have any state open in them. OK, now with 1GB I'm not that picky anymore, but I still notice that when music or video files fill up my cache, there's a small slowdown when re-starting apps, because their files aren't in the cache anymore.
Some older paging Un*x systems tracked the process working set and tried to keep it together. When the process was scheduled, the system checked the current working set and brought in any missing pages before restarting the process. Perhaps this is what you meant.
No, systems do that (with a global working-set AFAIK), but they allow some random big file to take up all cache memory and have application files (and executables) either swapped out or just dropped from physical memory (for read-only files). That's because they only have basic LRU from the '70s, with no priorities attached to memory blocks at all.
WS swapping was popular for a while, but it was abandoned by the time Unix System III arrived. It was found to cause thrashing in busy systems and proved to be difficult to tune effectively for different hardware configurations. If RAM is overcommitted, trying to bring in all the WS pages for one process just pushes out needed WS pages for other processes. Great disk workout ensues.
Yes, that's the problem with local instead of global policies. Since it's the OS's memory, the OS can best manage it itself.
Ultimately OS developers settled on the current system - single page demand replacement - as the best compromise solution. It made sense to abandon WS swapping when memory was tight. Now with huge memories becoming the norm it might make sense to revisit it as a performance boost.
No, I only want priorities for memory blocks. I'd leave the paging inside the kernel, but users could tell the kernel what priorities they have (like: I won't ever need this mp3 file again, but please cache my images and executable file with priority "normal"). Something like BeOS did with threads, so user-perceived performance (or latency) improves.
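To make the priority idea concrete, here is a toy cache that evicts the lowest-priority block first and falls back to LRU only among blocks of equal priority. Everything in it is hypothetical (the `PriorityCache` class, the `LOW`/`NORMAL` levels, the file names); it is a sketch of the policy, not of any existing kernel interface.

```python
from collections import OrderedDict

LOW, NORMAL = 0, 1  # hypothetical priority levels


class PriorityCache:
    """Toy block cache: evict lowest priority first, LRU within a priority."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> priority, least recently used first

    def access(self, key, priority):
        if key in self.entries:
            self.entries.move_to_end(key)  # now the most recently used
            self.entries[key] = priority
            return
        if len(self.entries) >= self.capacity:
            # min() returns the first minimal key in iteration order, i.e. the
            # least recently used entry among those with the lowest priority.
            victim = min(self.entries, key=lambda k: self.entries[k])
            del self.entries[victim]
        self.entries[key] = priority


cache = PriorityCache(capacity=3)
cache.access("editor.exe", NORMAL)
cache.access("libc.so", NORMAL)
for i in range(3):
    # Streaming a big media file that will never be read again.
    cache.access(f"song.mp3#{i}", LOW)
print(list(cache.entries))
```

With plain LRU the three mp3 chunks would have pushed out both the executable and the library; here the chunks only ever replace each other.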
-- Majority, n.: That quality that distinguishes a crime from a law.