Fast Data Comparison (dict v. list. v string)

From: Scott Brady Drummonds (scott.b.drummonds.nospam_at_intel.com)
Date: 04/14/04


Date: Wed, 14 Apr 2004 14:01:44 -0700

Hi, everyone,

I'm a relative novice at Python and am working on some optimizations in my
first Python project. At its core, this tool I'm writing needs to perform
many comparisons to fixed-size lists of fixed-length strings.

Currently, my implementation uses dictionaries to store each string. I
maintain a separate "key-mapping" dictionary that maps keys from one
dictionary to the other. Then, all I have to do to compare the two
dictionaries is as follows:
  for metaKey in keyMap.keys():
    if dict1[metaKey] != dict2[keyMap[metaKey]]:
      # Do some processing

Since the sizes of the dictionaries never change, I tried implementing this
using lists. The solution looks something like this (and assumes that a
pre-processing phase has sorted the contents of each list so their indexes
are the same):
  for i in len(list1):
    if list1[i] != list2[i]:
      # Do some processing

As it turns out, this implementation appears to be about 33% faster than the
dictionary-based one. Now, assuming that the datum being stored at each
index can fit into one character, I could do a string-based implementation
like this:
  for i in len(string1):
    if string1[i] != string[i]:
      # Do some processing

This last solution actually runs about the same as the dictionary, which
takes 50% longer than the list implementation.

Now, my questions are:
1) Does anyone have another suggestion as to how I can organize these data
so that I can compare many elements many times?
2) Is there a string comparison operator that will return which indexes
have different values? Maybe it would be faster than the iterative
comparison approach for the third implementation.
3) Since my data are changing between the successive executions of the code
snippets above, I need a way of having the other parts of the program update
it. But, it appears that strings are constant as I can't assign individual
characters with "string1[i] = '0'". Is there any way around this?

Thanks!
Scott

-- 
Remove ".nospam" from the user ID in my e-mail to reply via e-mail.


Relevant Pages

  • Re: comments on my design of a new language?
    ... > exist, plus strings and the function type, and that is it. ... indexed by a string or a regular expression, ... > dictionaries, that terminate with literals, or (possibly ... and lists. ...
    (comp.lang.misc)
  • Re: comments on my design of a new language?
    ... > indexed by a string or a regular expression, ... Trees are essentially nested dictionaries, ... I was originally going to treat scalars and unit lists equivalently, ... > dictionaries being indexed by strings or by regular expressions. ...
    (comp.lang.misc)
  • len() and PEP 3000
    ... I suggest that at least lists, tupples, sets, dictionaries and strings ...
    (comp.lang.python)
  • Dictionary 2
    ... Each dictionary contains a XML element attributes. ... key is the couple of strings that identifies an attribute, ... I now understand that I may need to compare two dictionaries ...
    (comp.programming)
  • Re: Thoughts about Python
    ... I know that tuples can be used as keys... ... Mutable keys in dictionaries either need to be ... Using lists forced to strings would be a difficult sell even without the ...
    (comp.lang.python)