Re: python: ascii read

From: Heiko Wundram (heikowu_at_ceosg.de)
Date: 09/16/04


To: python-list@python.org
Date: Thu, 16 Sep 2004 18:56:23 +0200

On Thursday, 16 September 2004 17:56, Brian van den Broek wrote:
> But I don't really feel I've a handle on the significance of saying it
> maps the file into memory versus reading the file. The naive thought is
> that since the data gets into memory, the file must be read. But this
> makes me sure I'm missing a distinction in the terminology. Explanations
> and pointers for what to read gratefully received.

read()ing a file into memory does what it says; it reads the binary data from
the disk all at once, and allocates main memory (as needed) to fit all the
data there. Memory mapping a file (or device or whatever) means that the
virtual memory architecture is involved. What happens here:

mmapping a file creates virtual memory pages (just like virtual memory which
is put into your paging file), which are registered with the MMU of the
processor as being absent initially.

Now, when the program tries to access the memory page (pages have some fixed
short length, like 4k on most Pentium-style computers), a page fault is
generated by the MMU, which invokes the operating system's handler for page
faults. The operating system sees which page was accessed (from the page
address it can deduce the offset in the file that you're trying to access),
loads the corresponding page from disk, puts it into memory at some position,
and marks the page table entry as present.

Future accesses to the page will take place immediately (without a page fault
taking place).
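
To make the difference concrete, here is a minimal Python sketch that reads
the same byte once via read() and once via an mmap, and shows the
offset-to-page arithmetic the fault handler conceptually performs. The helper
names (page_of, compare_read_and_mmap) and the scratch file are made up for
illustration; the page faults themselves happen inside the OS and are not
visible from Python.

```python
import mmap
import os
import tempfile


def page_of(offset, page_size=mmap.PAGESIZE):
    """Return (page index, offset within that page) for a file offset."""
    return divmod(offset, page_size)


def compare_read_and_mmap():
    # Hypothetical scratch file spanning three pages.
    size = mmap.PAGESIZE * 3
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(bytes(i % 256 for i in range(size)))

        offset = mmap.PAGESIZE + 5

        # read(): the whole file is copied into a bytes object up front.
        with open(path, "rb") as f:
            whole = f.read()

        # mmap: only the mapping is set up; the page holding `offset` is
        # brought in by the OS when it is first touched.
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                mapped_byte = m[offset]

        return whole[offset], mapped_byte, page_of(offset)
    finally:
        os.remove(path)
```

Both approaches return the same byte; the difference is only in when, and
how much of, the file is actually pulled into memory.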

Changes in memory are written to disk once the page is flushed (meaning that
it gets removed from main memory because too few pages of real main memory
are available). When a page is evicted this way (rather than because the mmap
was closed), the operating system marks the page table entry as absent again,
and the next time the program tries to access this location, a page fault
takes place again and the OS can reload the page from disk.
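
As a rough illustration of the write-back side, the following sketch modifies
a file through a writable mapping and then reads it back normally. The
function name and the file contents are invented for the example. Note that
flush() only asks the OS to write the modified pages out now; without it, the
write-back would still happen, just at a time of the OS's choosing (at the
latest when the mapping is closed).

```python
import mmap
import os
import tempfile


def write_through_mapping():
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello world")

        # The file must be opened for update to get a writable mapping.
        with open(path, "r+b") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
                m[0:5] = b"HELLO"  # changes the page in memory
                m.flush()          # request write-back of modified pages

        # An ordinary read now sees the change made through the mapping.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```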

For speed, the operating system allows you to mmap read-only, which means that
once a page is discarded, it does not need to be written back to disk (which
of course is faster). Most MMUs (including the Pentium-class MMU) set a dirty
bit on the page table entry once the page has been altered; this can also be
used to decide whether the page needs to be written back to disk before it is
discarded.
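
In Python, a read-only mapping is requested with access=mmap.ACCESS_READ. The
sketch below (the function name and file contents are hypothetical) shows
that an attempt to write through such a mapping is rejected by the mmap
object itself, and that the file on disk stays untouched.

```python
import mmap
import os
import tempfile


def try_write_readonly():
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"read only data")

        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                try:
                    m[0:1] = b"X"  # not allowed on a read-only mapping
                except TypeError:
                    rejected = True
                else:
                    rejected = False

        with open(path, "rb") as f:
            return rejected, f.read()
    finally:
        os.remove(path)
```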

So, basically what you get is load-on-demand file handling, which is similar
to what the paging file (virtual memory file) on win32 does for general
memory. Actually, internally, the architecture to handle mmapped files and
virtual memory is the same, and you could think of the swap file as an
operating system mmapped file, from which programs can allocate slices
through some OS calls (well, actually through the normal malloc/calloc
calls).
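
The closest thing to this in Python's mmap module is an anonymous mapping,
created by passing -1 instead of a file descriptor: the memory has no file of
your own behind it, and the OS is free to page it out to swap like any other
allocated memory. A small sketch, with a made-up function name and sizes:

```python
import mmap


def anonymous_scratch(size=mmap.PAGESIZE * 2):
    # fileno -1 asks for memory that is not backed by a named file.
    with mmap.mmap(-1, size) as m:
        initially_zero = m[:16] == b"\x00" * 16  # fresh pages start zeroed
        m[:5] = b"swap!"
        return initially_zero, m[:5], len(m)
```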

HTH!

Heiko.


