Text fingerprinting



Hi

The problem that I have is as follows:
We would like to measure similarity for text that has been copied from a
source. However, comparing the whole texts directly would not be feasible
(so plain string-matching algorithms are not useful), and so we would like
to generate some kind of fingerprint for each text, which can be compared
against a stored corpus of fingerprints to detect copying.
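
To make the idea concrete, here is a rough sketch of the kind of fingerprint
I have in mind. It is only one possible scheme and not a settled design:
hash every run of k consecutive words, then keep the smallest hash in each
sliding window of those hashes (this selection step is often called
winnowing). The function names, the word-level normalisation, k=5, window=4
and the truncated SHA-1 hash are all assumptions made for illustration.

```python
import hashlib
import re


def kgram_hashes(text, k=5):
    """Hash every run of k consecutive words.

    Assumed normalisation: lowercase, keep only runs of letters and digits.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    grams = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    # Truncate SHA-1 to 64 bits: small values that are stable across runs
    # (unlike Python's built-in hash(), which is salted per process).
    return [int(hashlib.sha1(g.encode("utf-8")).hexdigest()[:16], 16)
            for g in grams]


def fingerprint(text, k=5, window=4):
    """Return the set of minimum hashes over each sliding window."""
    hashes = kgram_hashes(text, k)
    if not hashes:
        return set()  # text shorter than k words: nothing to fingerprint
    if len(hashes) <= window:
        return {min(hashes)}
    return {min(hashes[i:i + window])
            for i in range(len(hashes) - window + 1)}


def similarity(fp_a, fp_b):
    """Jaccard overlap of two fingerprint sets, between 0.0 and 1.0."""
    if not fp_a or not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)
```

With this scheme only the fingerprint sets need to be stored and compared,
not the texts themselves. Any copied run of at least k + window - 1 words
should contribute at least one shared hash, while shorter overlaps may go
undetected; what similarity value counts as "copied" would still have to be
chosen by experiment.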

I know of other techniques such as watermarking, but they are not very
useful here, since anyone who copies the text can easily remove the
watermark.

Thanks in advance for any replies and/or pointers to resources.

best regards
sumedh
