Re: sorting with expensive compares?
- From: bonono@xxxxxxxxx
- Date: 23 Dec 2005 09:20:55 -0800
Dan Stromberg wrote:
> On Thu, 22 Dec 2005 22:06:42 +0000, Dan Stromberg wrote:
> > Hi folks.
> > Python appears to have a good sort method, but when sorting array elements
> > that are very large, and hence have very expensive compares, is there some
> > sort of already-available sort function that will merge like elements into
> > a chain, so that they won't have to be recompared as many times?
> > Thanks!
> I probably should've been more specific.
> I'm wanting to sort a large number of files, like a bunch of output files
> from a large series of rsh or ssh outputs on a large series of distinct
> machines, a music collection in .ogg format (strictly redistributable and
> legally purchased music), a collection of .iso cdrom images (strictly
> redistributable and legally purchased software), and so forth.
> I'm not sorting the content of each file individually.
> I'm treating each file as a potentially very large string, and "sorting
> the strings".
> I've been using the following compare function, which in short checks, in
> 1) device number
> 2) inode number
> 3) file length
> 4) the beginning of the file
> 5) an md5 hash of the entire file
> 6) the entire file
> (If #1 and #2 are identical, then the file must be a hardlink to the other
> file. Also, files that do not have the same length can never be
> identical. And of course these items are generated on demand, making it
> frequently possible to avoid doing full-file-length compare on a lot of
> However, my program is still doing more #6 comparisons than seems
> strictly necessary when I could just toss all the filenames describing
> identical files into a list, and avoid re-comparing files with identical
> content over and over - I don't want to compare them to each other again
> and again), when there are a lot of identical files in the input list.
Why would #5 not enough as an indicator that the files are indentical ?
- Prev by Date: Re: Vaults of Parnassus hasn't been updated for months
- Next by Date: Re: How to get started in GUI Programming?
- Previous by thread: Re: sorting with expensive compares?
- Next by thread: Re: sorting with expensive compares?