Re: Match two file with CompareMem

From: Dr John Stockton (spam_at_merlyn.demon.co.uk)
Date: 12/05/03


Date: Fri, 5 Dec 2003 20:39:41 +0000

JRS: In article <3fce13f3$1@newsgroups.borland.com>, seen in
news:borland.public.delphi.language.objectpascal, John Herbster (TeamB)
<herb-sci1_AT_sbcglobal.net@?.?> posted at Wed, 3 Dec 2003 10:38:43 :-
>
>"Roman Krejci" <faccount@rksolution.nospam.cz> wrote
>> ...
>> Personally, I prefer comparing checksums (CRC's)
>
>And computing checksums, CRCs, digital digests, or whatever
>you call them can be done without reading both files at the
>same time -- thus in some cases tying up fewer computer
>resources. --JohnH

If the need is only to establish whether two files are identical or not :

First, compare the sizes. Files of different sizes are not the same.
The act of getting sizes, in any recent OS, will cache data needed if
the files then have to be opened.

If the files have arbitrary properties, compare block-by-block. Most
files are not the same, so the comparison will probably give a result in
the first block.
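
The two steps above (sizes first, then block-by-block) can be sketched as
follows. This is only an illustration in Python, not Delphi; the name
files_identical and the 64 KiB block size are assumptions of the sketch.
In Delphi the block test would be the natural place for CompareMem.

```python
import os

BLOCK_SIZE = 64 * 1024  # assumed block size; a multiple of common sector sizes


def files_identical(path_a, path_b, block_size=BLOCK_SIZE):
    """Return True if the two files have identical contents."""
    # Cheap test first: files of different sizes cannot be the same.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(block_size)
            block_b = fb.read(block_size)
            if block_a != block_b:
                return False  # differing files usually stop in the first block
            if not block_a:
                return True  # both files ended and no difference was found
```

Only one block per file is held in memory at a time, and the loop stops at
the first block that differs.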

If the files are likely only to differ in the later parts, do as above,
but backwards (choosing blocks to match multiple sectors, of course).

If the files are likely to differ only in the middle, start in the
middle.
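
Choosing where to start amounts to choosing the order in which block
offsets are visited. A minimal sketch in Python (not Delphi); the names
block_offsets and files_identical, the order parameter and the block size
are all assumptions of the sketch. Offsets stay on block boundaries, so
blocks still line up with sectors.

```python
import os

BLOCK_SIZE = 64 * 1024  # assumed block size; a multiple of common sector sizes


def block_offsets(size, order="forward", block_size=BLOCK_SIZE):
    """Return block start offsets in the order they should be compared."""
    offsets = list(range(0, size, block_size))
    if order == "backward":
        offsets.reverse()
    elif order == "middle":
        # Visit the middle block first, then work outwards from it.
        mid = len(offsets) // 2
        offsets.sort(key=lambda off: abs(off // block_size - mid))
    return offsets


def files_identical(path_a, path_b, order="forward", block_size=BLOCK_SIZE):
    """Compare two files block by block, visiting blocks in the given order."""
    size = os.path.getsize(path_a)
    if size != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        for off in block_offsets(size, order, block_size):
            fa.seek(off)
            fb.seek(off)
            if fa.read(block_size) != fb.read(block_size):
                return False
    return True
```

For a 10-byte file and 4-byte blocks, block_offsets(10, "backward", 4)
gives [8, 4, 0] and block_offsets(10, "middle", 4) gives [4, 0, 8].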

For multiple files :

List the files and their sizes; sort by size.

Any file that is of a different size to both its neighbours is unique;
discard.
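
The size pass might look like this in Python (a sketch only; the name
candidate_sets_by_size is made up for the illustration). After sorting by
size, a file differs from both its neighbours exactly when its size group
has one member, so such groups are dropped.

```python
import os
from itertools import groupby


def candidate_sets_by_size(paths):
    """Group paths by file size, dropping files whose size is unique."""
    # List the files with their sizes and sort by size (then by path).
    sized = sorted((os.path.getsize(p), p) for p in paths)
    sets = []
    for _size, group in groupby(sized, key=lambda item: item[0]):
        members = [path for _, path in group]
        if len(members) > 1:  # a lone size means the file is unique; discard
            sets.append(members)
    return sets
```

Each list returned holds two or more files of equal size, which are the
only candidates for being identical.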

Sets of two files, now compare as above.

Sets of several files, compute checksums, sort by checksum, discard odd-
men-out.
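
For a set of several same-size files, the checksum pass could be sketched
as below (Python; file_crc32 and split_by_checksum are names invented for
the sketch, and CRC-32 from zlib stands in for whatever checksum is
preferred). A dictionary keyed by checksum is used in place of an explicit
sort; the effect is the same.

```python
import zlib
from collections import defaultdict


def file_crc32(path, block_size=64 * 1024):
    """Compute the CRC-32 of a file, reading it one block at a time."""
    crc = 0
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            crc = zlib.crc32(block, crc)
    return crc


def split_by_checksum(paths, checksum=file_crc32):
    """Split a set of files into subsets sharing a checksum; drop odd-men-out."""
    by_sum = defaultdict(list)
    for path in paths:
        by_sum[checksum(path)].append(path)
    return [sorted(group) for group in by_sum.values() if len(group) > 1]
```

Each file is read once, one at a time. Files left sharing a checksum are
probably, but not certainly, identical.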

If several is greater than three, one can compare pairwise for remaining
sets between two and several in number.

For unresolved cases, either compare pairwise or repeat as above with a
different checksum.
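
The pairwise option for unresolved sets can be sketched in Python as
follows. identical_classes is a name made up for the sketch; filecmp.cmp
with shallow=False is the standard-library byte-by-byte comparison. Each
file is compared against one representative of every class found so far,
which keeps the number of comparisons down when most files match.

```python
import filecmp


def identical_classes(paths):
    """Partition files into classes with identical contents; drop singletons."""
    classes = []
    for path in paths:
        for cls in classes:
            # Compare contents against the first member of each known class.
            if filecmp.cmp(cls[0], path, shallow=False):
                cls.append(path)
                break
        else:
            classes.append([path])  # matches nothing seen so far
    return [cls for cls in classes if len(cls) > 1]
```

Unlike a checksum match, a class returned here is a definite result.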
 

-- 
 © John Stockton, Surrey, UK.  ?@merlyn.demon.co.uk   Delphi 3   Turnpike 4 ©
 <URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/&c., FAQqy topics & links;
 <URL:http://www.bancoems.com/CompLangPascalDelphiMisc-MiniFAQ.htm> clpdmFAQ;
 <URL:http://www.borland.com/newsgroups/guide.html> news:borland.* Guidelines

