Re: eval to dict problems NEWB going crazy !



On Thu, 06 Jul 2006 03:34:32 -0700, manstey wrote:

Hi,

I have a text file called a.txt:

# comments
[('recId', 3), ('parse', {'pos': u'np', 'gen': u'm'})]
[('recId', 5), ('parse', {'pos': u'np', 'gen': u'm'})]
[('recId', 7 ), ('parse', {'pos': u'np', 'gen': u'm'})]

I read it using this:

filAnsMorph = codecs.open('a.txt', 'r', 'utf-8')  # Initialise input file
dicAnsMorph = {}
for line in filAnsMorph:
    if line[0] != '#':  # Get rid of comment lines
        x = eval(line)
        dicAnsMorph[x[0][1]] = x[1][1]  # recid is key, parse dict is value

But it crashes every time on x = eval(line). Why is this?

Some people have incorrectly suggested the solution is to remove the
newline from the end of the line. Others have already pointed out one
possible solution.

I'd like to ask, why are you using eval in the first place?

The problem with eval is that it is simultaneously too finicky and too
powerful. It is finicky -- it has problems with lines ending with a
carriage return, empty lines, and probably other things. But it is also
too powerful. Your program wants a specific piece of data, but eval
will accept any string which is a valid Python expression. eval is quite
capable of giving you a dictionary, or an int, or just about anything --
and, depending on your code, you might not find out for a long time,
leading to hard-to-debug bugs.
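
To make the "too powerful" point concrete, here is a small sketch. The helper name is_record is made up for illustration, and the candidate strings are ones I wrote myself, so evaluating them is harmless. Every candidate is a valid expression, so eval accepts all of them without complaint, yet only the first has the shape the program expects.

```python
def is_record(x):
    """Return True if x looks like [('recId', int), ('parse', dict)].

    Hypothetical helper, not part of the original program.
    """
    return (
        isinstance(x, list)
        and len(x) == 2
        and all(isinstance(item, tuple) and len(item) == 2 for item in x)
        and x[0][0] == 'recId' and isinstance(x[0][1], int)
        and x[1][0] == 'parse' and isinstance(x[1][1], dict)
    )

# All valid Python expressions; eval raises no error for any of them.
candidates = [
    "[('recId', 3), ('parse', {'pos': 'np', 'gen': 'm'})]",
    "{'recId': 3}",
    "42",
    "'just a string'",
]

# Only an explicit shape check tells the good line from the rest.
results = [is_record(eval(text)) for text in candidates]
```

Without a check of this kind, the wrong-shaped values would only be noticed later, wherever the code first indexes into them.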

Is your data under your control? Could some malicious person inject data
into your file a.txt? If so, you should be aware of the security
implications:

# comment
[('recId', 3), ('parse', {'pos': u'np', 'gen': u'm'})]
[('recId', 5), ('parse', {'pos': u'np', 'gen': u'm'})]
# line injected by a malicious user
__import__('os').system('echo if I were bad I could do worse')
[('recId', 7 ), ('parse', {'pos': u'np', 'gen': u'm'})]

Now, if the malicious user can only damage their own system, maybe you
don't care -- but the security hole is there. Are you sure that no
malicious third party, given *only* write permission to the file a.txt,
could compromise your entire system?
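
If you want to convince yourself that eval really runs code found in a data line, rather than merely building a value from it, a harmless stand-in for the injected line is enough. The variable names here are my own; the only thing the "payload" does is ask for the current process ID.

```python
import os

# Harmless stand-in for an injected line: like the malicious example, it
# imports a module and calls a function, but it only asks for the current
# process ID.
injected_line = "__import__('os').getpid()"

# eval does not just turn the text into data; it performs the call.
result = eval(injected_line)

# The value can only have come from actually running os.getpid().
ran_code = (result == os.getpid())
```

Anything else reachable through os could be called the same way.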

Personally, I would never use eval on any string I didn't write myself. If
I were thinking about evaluating a user-supplied string, I would always
write a function to parse the string and accept only the specific sort of
data I expected. In your case, a quick-and-dirty untested function might be:

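def parse(s):
    """Parse string s, and return a two-item list like this:

    [tuple(string, integer), tuple(string, dict(string: string))]
    """

    def split_pair(s):
        """Split s at its single top-level comma into two items.

        Commas nested inside brackets are ignored. Quoted strings are
        not examined, so they must not contain commas or brackets.
        """
        depth = 0
        commas = []
        for i, c in enumerate(s):
            if c in "([{":
                depth += 1
            elif c in ")]}":
                depth -= 1
            elif c == "," and depth == 0:
                commas.append(i)
        assert len(commas) == 1
        return s[:commas[0]].strip(), s[commas[0] + 1:].strip()

    def parse_string(s):
        """Parse a quoted string such as 'np' or u'np'."""
        s = s.strip()
        if s.startswith("u"):
            s = s[1:]
        assert len(s) >= 2 and s[0] == s[-1] and s[0] in "'\""
        return s[1:-1]

    def parse_tuple(s):
        """Parse a tuple with two items exactly."""
        s = s.strip()
        assert s.startswith("(")
        assert s.endswith(")")
        return split_pair(s[1:-1])

    def parse_dict(s):
        """Parse a dict with two items exactly."""
        s = s.strip()
        assert s.startswith("{")
        assert s.endswith("}")
        a, b = split_pair(s[1:-1])
        key1, value1 = a.split(":")
        key2, value2 = b.split(":")
        return {parse_string(key1): parse_string(value1),
                parse_string(key2): parse_string(value2)}

    def parse_list(s):
        """Parse a list with two items exactly."""
        s = s.strip()
        assert s.startswith("[")
        assert s.endswith("]")
        return list(split_pair(s[1:-1]))

    # Expected format is something like:
    # [tuple(string, integer), tuple(string, dict(string: string))]
    L = parse_list(s)
    T0 = parse_tuple(L[0])
    T1 = parse_tuple(L[1])
    T0 = (parse_string(T0[0]), int(T0[1]))
    T1 = (parse_string(T1[0]), parse_dict(T1[1]))
    return [T0, T1]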


That's a bit more work than eval, but I believe it is worth it.
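
For readers on Python 2.6 or later, the standard library's ast.literal_eval is another option in the same spirit: it accepts only literals (strings, numbers, tuples, lists, dicts and so on) and rejects function calls. Below is a rough sketch of the reading loop built on it. The name load_records and the error message are my own inventions, and the shape check assumes every record looks like the lines in a.txt.

```python
import ast

def load_records(lines):
    """Build {recId: parse dict} from an iterable of text lines.

    Hypothetical helper; assumes each record line looks like
    [('recId', <int>), ('parse', {...})].
    """
    records = {}
    for lineno, line in enumerate(lines, 1):
        line = line.strip()  # also drops a trailing newline or carriage return
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        try:
            # literal_eval refuses anything that is not a plain literal.
            (name0, rec_id), (name1, parsed) = ast.literal_eval(line)
            ok = (name0 == "recId" and name1 == "parse"
                  and isinstance(rec_id, int) and isinstance(parsed, dict))
        except (ValueError, SyntaxError, TypeError):
            ok = False
        if not ok:
            raise ValueError("line %d is not a valid record: %r" % (lineno, line))
        records[rec_id] = parsed
    return records

sample = [
    "# comments\n",
    "[('recId', 3), ('parse', {'pos': u'np', 'gen': u'm'})]\r\n",
    "\n",
    "[('recId', 7 ), ('parse', {'pos': u'np', 'gen': u'm'})]\n",
]
records = load_records(sample)
```

A line such as the injected os.system call is rejected with a ValueError instead of being executed, and blank lines or stray carriage returns no longer matter.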

--
Steven
