Re: bz2 & cpu usage

From: Brad Tilley (bradtilley_at_gmail.com)
Date: Wed, 20 Oct 2004 11:45:53 -0400

Kirk Job-Sluder wrote:
> Sorry for the late post, the original scrolled off the server.
>
> > I'd like to keep at least 50% of the cpu free while doing bz2 file
> > compression. Currently, bz2 compression takes between 80 & 100 percent
> > of the cpu and the Windows GUI becomes almost useless. How can I lower
> > the strain on the cpu and still do compression? I'm willing for the
> > compression process to take longer.
> >
> > Thanks,
> >
> > Brad
> >
> > def compress_file(filename):
> >     path = r"C:\repository_backup"
> >     print path
> >     for root, dirs, files in os.walk(path):
> >         for f in files:
> >             if f == filename:
> >                 print "Compressing", f
> >                 x = file(os.path.join(root, f), 'rb')
> >                 os.chdir(path)
> >                 y = bz2.BZ2File(f + ".bz2", 'w')
> >                 while True:
> >                     data = x.read(1024000)
> >                     time.sleep(0.1)
> >                     if not data:
> >                         break
> >                     y.write(data)
> >                     time.sleep(0.1)
> >                 y.close()
> >                 x.close()
> >             else:
> >                 return
>
> One of the issues you may be running into is memory. Under Windows,
> using up 90% of the CPU shouldn't affect GUI performance (much), but
> swapping does. According to the bzip2 man page, the maximum block size
> is 900KB, so you might be running into problems reading your file 1024KB
> at a time. Use the system monitor control panel to check for excessive
> swapping. Bzip2 uses 8x<blocksize> memory, so with the default setting
> of a 900KB block size, you are looking at 7.2MB plus some bookkeeping
> memory.
>
> Another issue is that you might be better off downloading bzip2 for
> Windows and letting the bzip2 command-line program handle file input and
> output. Using a shell command here might be more efficient in spite of
> spawning a new process.
>
> A third issue is that bzip2 achieves high compression efficiency at the
> expense of CPU time and memory. It might be worth considering whether
> gzip might occupy the sweet spot between minimal archive size
> and minimal CPU usage.
>
> Fourth, how many of those files are uncompressible? I've noticed that
> bzip2 tries really hard to eke out some form of savings from
> uncompressible files. A filename filter for files that should not be
> compressed (png, jpg, gif, sx*) might be worth doing here.

Thanks for the tips. I installed 512MB of ECC RAM and the problem went away.
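
For readers who land here with the same question, here is a minimal Python 3 sketch of two of the ideas discussed above: skipping files that are unlikely to compress, and leaving part of the CPU idle while compressing. It is not the code from the thread. The names should_compress, compress_throttled and cpu_share are made up for illustration, and the throttle (sleeping in proportion to the time spent writing each chunk) is only an assumption about one way to trade speed for idle CPU. It does nothing about swapping, which is what turned out to matter in this case.

```python
import bz2
import os
import time

# Extensions named in the thread as not worth compressing (png, jpg, gif);
# the ".sx" prefix check stands in for the "sx*" pattern. Adjust to taste.
SKIP_EXTENSIONS = {".png", ".jpg", ".gif"}


def should_compress(name):
    """Return False for files that are probably compressed already."""
    ext = os.path.splitext(name)[1].lower()
    return ext not in SKIP_EXTENSIONS and not ext.startswith(".sx")


def compress_throttled(src, dst=None, chunk_size=900 * 1000, cpu_share=0.5):
    """Compress src to dst with bz2, idling between chunks.

    cpu_share is the rough fraction of wall-clock time spent working;
    0.5 means "sleep about as long as each write took".
    """
    if not 0 < cpu_share <= 1:
        raise ValueError("cpu_share must be in (0, 1]")
    if dst is None:
        dst = src + ".bz2"
    with open(src, "rb") as fin, bz2.open(dst, "wb") as fout:
        while True:
            data = fin.read(chunk_size)
            if not data:
                break
            start = time.perf_counter()
            fout.write(data)  # the compression work happens here
            busy = time.perf_counter() - start
            # Idle long enough that busy time is about cpu_share of the total.
            time.sleep(busy * (1 - cpu_share) / cpu_share)
    return dst
```

A call such as compress_throttled(r"C:\repository_backup\big.dat", cpu_share=0.5) would write big.dat.bz2 next to the source file (the path is only an example). The with-block closes both files even if a write fails, which the original loop did not guarantee.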


