Re: looking for some size optimization
- From: xhoster@xxxxxxxxx
- Date: 15 Apr 2007 22:27:33 GMT
espie@xxxxxxxxx wrote:
I'm looking at a script that handles a huge amount of data... basically,
the filenames from +4000 packages in order to recognize conflicts.
How many more than 4000 do you have? To take up the amount of room you
are talking about, it seems you would need orders of magnitude more
than 4000. But maybe I don't understand the nature of the data. What does
a conflict end up being represented as?
Currently, it builds a big hash through a loop that constructs
$all_conflict like this:
my $file= File::Spec->canonpath($self->fullname());
push @{$all_conflict->{$file}}, $pkgname;
Is $pkgname the name of the package declared in $file, or the name
of the package used in $file? In any event, are you pushing the same
$pkgname onto the same file's list over and over? Maybe you should use a
hash of hashes instead of a hash of arrays:
$all_conflict->{$file}->{$pkgname}=();
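To make the difference concrete, here is a minimal sketch comparing the two layouts. The file names, package names, and the variables %as_array and %as_hash are made up for illustration and are not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data: (file, package) pairs, with one pair repeated.
my @pairs = (
    [ '/usr/local/bin/foo', 'pkg-a' ],
    [ '/usr/local/bin/foo', 'pkg-b' ],
    [ '/usr/local/bin/foo', 'pkg-a' ],    # same package registered twice
    [ '/usr/local/bin/bar', 'pkg-a' ],
);

my (%as_array, %as_hash);
for my $pair (@pairs) {
    my ($file, $pkgname) = @$pair;
    push @{ $as_array{$file} }, $pkgname;    # keeps every duplicate
    $as_hash{$file}{$pkgname} = ();          # one key per distinct package
}

# A file is in conflict when more than one distinct package claims it.
my @conflicts = sort
    grep { scalar( keys %{ $as_hash{$_} } ) > 1 }
    keys %as_hash;

print "conflict: $_\n" for @conflicts;
```

Note that a small hash costs more memory per entry than a small array, so this layout is only likely to pay off if the same $pkgname really is pushed many times for the same file.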
I end up with a hash of 250M.
I can't really use Devel::Size with any benefit, since all the data
is one single chunk (all the other data in the program amount to <2~3M)
I expect the $pkgname strings to be shared.
Hash keys are shared (with substantial overhead) but array values are not.
If your paths are really so long, you might try compressing them, or
digesting them.
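A rough sketch of the digesting idea, assuming Digest::MD5 (a core module) and a made-up helper named register(); the paths and package names are illustrative only. Each key becomes 16 raw bytes whatever the path length, so this only helps if the paths average well over 16 bytes.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

my %all_conflict;

# Hypothetical helper: record that $pkgname ships $file, keyed by digest.
sub register {
    my ($file, $pkgname) = @_;
    my $key = md5($file);    # 16 raw bytes regardless of path length
    $all_conflict{$key}{$pkgname} = ();
}

my $long = '/usr/local/share/doc/some-package/very/deep/directory/README';
register($long, 'pkg-a');
register($long, 'pkg-b');
register('/usr/local/bin/bar', 'pkg-a');

# Looking a path up means digesting it again.
my @owners = sort keys %{ $all_conflict{ md5($long) } };
print "owners: @owners\n";
```

The trade-off is that the path can no longer be read back from the key, so reporting a conflict by file name would need a second pass over the package lists (or some other way to map digests back to paths).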
....
I'm looking for bright ideas to try and reduce the size used... without
being too detrimental in terms of speed...
Without knowing more about the exact nature of the data, it is hard to say.
An optimization that involves changing the way a structure is built is often
driven by the way the structure is used, so showing that part may also be
useful.
Xho
- References:
- looking for some size optimization
- From: Marc Espie