Re: match data based on matching field



On Sat, 2006-10-28 at 08:59 -0600, Jack Daniels (Butch) wrote:
> I work with library cataloguing records and am trying to accomplish this amazing feat.
>
> Trying is no problem; the amazing feat is eluding me.
>
> I need to find all records with the same ISBN (field 020) and save all matching records, which will result in more than one record with the same ISBN. Then I need to somehow magically end up with only one record containing all the 856 fields.

If there is not a huge number of ISBNs, load a hash with an array
for each ISBN. Then, for each element of the hash, loop through its
array of records from the different input streams and use business
logic to merge them. The simplest approach: if a field is empty in
the merged record, use the content provided by the other record.
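
A minimal sketch of that hash-of-arrays idea, written in Python only
for illustration (a dict of lists plays the role of the hash of
arrays). Everything here is an assumption rather than anything from
the original data: each record is assumed to be already parsed into a
dict mapping a field tag to a list of values, the first 020 value is
taken as the ISBN, and the names group_by_isbn, merge_group and
merge_records are hypothetical. The 856 field is collected from every
record, since that is the stated goal; every other field uses the
"if empty, use the content provided" rule.

```python
from collections import defaultdict


def group_by_isbn(records):
    """Build the 'hash of arrays': ISBN -> list of records sharing it."""
    groups = defaultdict(list)
    for rec in records:
        isbns = rec.get("020", [])
        if isbns:  # records without an ISBN are skipped in this sketch
            groups[isbns[0]].append(rec)
    return groups


def merge_group(group):
    """Merge all records of one ISBN into a single record."""
    merged = {}
    for rec in group:
        for tag, values in rec.items():
            if tag == "856":
                # Keep every distinct 856 value from every record.
                collected = merged.setdefault("856", [])
                for value in values:
                    if value not in collected:
                        collected.append(value)
            elif not merged.get(tag):
                # Field still empty in the merged record: use this content.
                merged[tag] = list(values)
    return merged


def merge_records(records):
    """Return ISBN -> single merged record."""
    return {isbn: merge_group(group)
            for isbn, group in group_by_isbn(records).items()}
```

This keeps every record in memory at once, which is why it only suits
a modest number of ISBNs.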

A more system-oriented approach might be to write a quick parser of
the input files that adds the ISBN as the first key and the source as
a second key. Sort all the files together by ISBN, then by input
file. This would allow you to prioritise where the data comes from.
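
The sort-based variant could look roughly like this, again in Python
and again under assumptions: records are already parsed into dicts of
tag -> list of values, the sources are passed in order of preference
(the first source wins when two records disagree), and the names
keyed and merge_sorted are hypothetical. For large inputs the same
idea would be applied with an external sort on disk instead of
sorted() in memory.

```python
from itertools import groupby
from operator import itemgetter


def keyed(records_by_source):
    """Yield (isbn, source_rank, record) for every record with an ISBN.

    source_rank is the position of the source in the input list, so a
    lower rank means a more trusted source.
    """
    for rank, records in enumerate(records_by_source):
        for rec in records:
            if rec.get("020"):
                yield (rec["020"][0], rank, rec)


def merge_sorted(records_by_source):
    """Sort by ISBN, then by source, and merge each run of equal ISBNs."""
    # Sort on the two keys only; the record dicts are never compared.
    rows = sorted(keyed(records_by_source), key=itemgetter(0, 1))
    merged = {}
    for isbn, group in groupby(rows, key=itemgetter(0)):
        out = {}
        for _, _, rec in group:
            for tag, values in rec.items():
                if tag == "856":
                    # Collect the 856 values from every source.
                    out.setdefault("856", []).extend(values)
                elif not out.get(tag):
                    # First non-empty value wins, i.e. the preferred source.
                    out[tag] = list(values)
        merged[isbn] = out
    return merged
```

Changing the order of the sources changes which one supplies a
contested field, which is the prioritisation mentioned above.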

This whole process can get very complex. My workplace has megalithic
programs that merge direct-mail address lists together.

--
Ken Foskey
FOSS developer
