Re: Is Greenspun enough?
- From: George Neuner <gneuner2/@comcast.net>
- Date: Thu, 08 Dec 2005 15:07:53 -0500
On Thu, 08 Dec 2005 10:40:28 +0100, Ulrich Hobelmann
<u.hobelmann@xxxxxx> wrote:
>George Neuner wrote:
>> On Wed, 07 Dec 2005 09:37:53 +0100, Ulrich Hobelmann
>> <u.hobelmann@xxxxxx> wrote:
>>
>>> George Neuner wrote:
>
>>> mmap()ed files are also cached, no?
>>
>> No. Mapped files are handled by the virtual memory system and all
>> modern systems DMA pages directly to/from disk with no buffering.
>
>Seriously? I thought every common OS would buffer/cache most pages in
>those files, so it doesn't need to access disk every time something is
>read or written (just with normal file reads/writes).
Seriously, you need to read up on MMUs, virtual memory and demand
paging systems.

The VMM accesses the disk only when necessary: if a page isn't
currently in RAM when it is needed, it is read in from the disk.
The page then remains accessible in RAM until it is forced out due to
overcommitment according to the replacement policy, whereupon the page
contents are written back to the disk to preserve them. When the last
process unmaps the page and returns it to the OS, the contents are
discarded.
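
To make the sequence concrete, here is a toy model of demand paging. It
is only a sketch: the class name ToyVMM, the dict standing in for the
disk, the plain LRU policy and the rule that only modified pages are
written back on eviction are all my assumptions, not how any particular
kernel is written.

```python
from collections import OrderedDict


class ToyVMM:
    """Toy demand pager: a few RAM frames in front of a 'disk' dict."""

    def __init__(self, frames, disk):
        self.frames = frames          # number of physical page slots
        self.disk = disk              # page number -> contents (backing store)
        self.ram = OrderedDict()      # page number -> [contents, dirty], in LRU order
        self.disk_reads = 0
        self.disk_writes = 0

    def _fault_in(self, page):
        if page in self.ram:
            # Page is resident: no disk access, just note the recent use.
            self.ram.move_to_end(page)
            return
        if len(self.ram) >= self.frames:
            # RAM is overcommitted: steal the least recently used slot.
            victim, (data, dirty) = self.ram.popitem(last=False)
            if dirty:
                # Assumption: only modified pages need to be written back.
                self.disk[victim] = data
                self.disk_writes += 1
        # Page fault: read the page in from the backing store.
        self.ram[page] = [self.disk.get(page, b""), False]
        self.disk_reads += 1

    def read(self, page):
        self._fault_in(page)
        return self.ram[page][0]

    def write(self, page, data):
        self._fault_in(page)
        self.ram[page][0] = data
        self.ram[page][1] = True
```

Reading the same page twice bumps disk_reads only once; the disk is
touched again only after the page has been forced out and is needed
again.
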
mmap() tells the operating system to use a particular file/device as
backing store for a particular range of virtual addresses in the
current process. When the map is first established the pages
underlying the range are not valid, so touching a page causes a page
fault and suspends the process. The VMM immediately reads the page
data from the corresponding range in the backing file and then allows
the process to continue. The page is then valid memory and can be
accessed normally until either the process unmaps it or until the VMM
needs the physical page slot for some other purpose - usually because
RAM is overcommitted. If the VMM steals the page slot away, the page
data (if it has been modified) is preserved by writing it to its
corresponding range in the backing file. When the process goes to
access the page again, the page will be invalid and the fault handler
will read it in again.

Finally, munmap() breaks the correspondence between the address range
and the file and unmaps the pages from the process. For a shared
mapping, any modified pages in the range are written to the backing
file; call msync() first if the data must be on disk at a known point.
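
For anyone who wants to poke at this, the following is a minimal sketch
using Python's standard mmap module, which wraps the same calls. The
helper names patch_via_mapping and demo and the temporary file are made
up for the illustration; the page faults themselves happen inside the
OS and are not visible from the script.

```python
import mmap
import os
import tempfile


def patch_via_mapping(path, offset, new_bytes):
    """Modify a file by storing into a mapping of it rather than calling write()."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the default is a shared, writable map.
        with mmap.mmap(f.fileno(), 0) as m:
            first = m[0]                  # first touch: the OS faults the page in
            m[offset:offset + len(new_bytes)] = new_bytes  # plain memory store
            m.flush()                     # ask for modified pages to be written back
        # Leaving the inner block closes the map, i.e. unmaps the range.
    return first


def demo():
    """Create a one-page file, patch it through a mapping, read it back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello world" + b"\0" * (mmap.PAGESIZE - 11))
        patch_via_mapping(path, 0, b"HELLO")
        with open(path, "rb") as f:
            return f.read(11)
    finally:
        os.remove(path)
```

Calling demo() reads the file back through ordinary file I/O after the
map is closed, so it shows that the stores made through the mapping
ended up in the backing file.
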
[IMO, the control offered by mmap() et al. is too coarse-grained.
The Windows file mapping API is easier to use and more flexible.
Application-level control of the process's virtual memory is one of
the few areas where Windows really outshines Unix/Linux.]
>I wager most libraries and maybe even executables on Unix systems are
>mmap()ed, so I really can't believe those files aren't ever cached, but
>reread from disk on every access...
See above.
>> I haven't investigated exactly *how* current OS loaders handle code
>> relocation.
>
>I think most functions are accessed only indirectly through a table;
>yes, sounds awfully slow, but I too haven't exactly understood the
>badly-documented machine-level stuff that happens inside a typical
>dynamic loader. :(
My point was that code relocation requires adding an offset to every
non-relative internal address reference so that it works at its new
location. The executable file contains metadata for each code segment
indicating where these addresses are to be found.
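
As a rough sketch of what that patching amounts to, assume a toy image
format in which the metadata is simply a list of byte offsets, each
pointing at a 32-bit little-endian absolute address. The function name
relocate and this layout are invented for the example; real object
formats carry more kinds of relocation entries than this.

```python
import struct


def relocate(image, fixups, link_base, load_base):
    """Return a copy of image with every listed address slot rebased.

    image     -- bytes of the code segment as stored in the file
    fixups    -- byte offsets of the absolute addresses (the metadata)
    link_base -- address the code was linked to run at
    load_base -- address the loader actually placed it at
    """
    delta = load_base - link_base
    out = bytearray(image)
    for off in fixups:
        (addr,) = struct.unpack_from("<I", out, off)
        # Add the load offset, wrapping to 32 bits like the hardware would.
        struct.pack_into("<I", out, off, (addr + delta) & 0xFFFFFFFF)
    return bytes(out)
```

Words that are not listed in fixups (relative references, plain data)
are left untouched, which is why the metadata is needed in the first
place: the loader cannot tell an address from a number by looking at it.
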
Older systems used to handle relocation by copying the code into swap,
patching up the address references and then executing the modified
code directly from the swap.

I think current systems cache the fix-up address map and patch the
code on demand, one page at a time, as needed.
>> Some older paging Un*x systems tracked the process working set and
>> tried to keep it together. When the process was scheduled, the system
>> checked the current working set and brought in any missing pages
>> before restarting the process. Perhaps this is what you meant.
>
>No, systems do that (with a global working-set AFAIK), but they allow
>some random big file to take up all cache memory and have application
>files (and executables) either swapped out or just dropped from physical
>memory (for read-only files). That's because they only have basic LRU
>from the '70s, with no priorities attached to memory blocks at all.
I think I understand what you're getting at ... you're talking about
letting applications control some aspects of their own file caching
without having to handle it explicitly. I agree that this could be a
useful thing.
George
--
for email reply remove "/" from address