Hex editor display - can this be more pythonic?



Hi:

I'm building a hex line editor as a first real Python programming exercise.

Yesterday I posted about how to print the hex bytes of a string. There are two decent options:

import sys
ln = '\x00\x01\xFF 456\x0889abcde~'
for c in ln:
    # write() rather than print, so no newline is added after each byte
    sys.stdout.write( '%.2X ' % ord(c) )

or this:

sys.stdout.write( ' '.join( ['%.2X' % ord(c) for c in ln] ) + ' ' )

Either of these produces the desired output:

00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E

I find the former simpler and more readable. The latter, however, has the slight advantage of not putting a space at the end unless I really want one. But which is more pythonic?
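
For comparison, here is a minimal sketch of the join form wrapped in a function. It assumes Python 3 and a bytes object instead of the str used above, and hex_bytes is only an illustrative name. Iterating over bytes yields integers, so ord() is not needed, and join() accepts a generator, so the list brackets can go too:

```python
def hex_bytes(data):
    """Return the bytes of data as two-digit uppercase hex, space-separated."""
    # Iterating over a bytes object yields ints, so no ord() call is needed.
    return ' '.join('%02X' % b for b in data)

ln = b'\x00\x01\xFF 456\x0889abcde~'
print(hex_bytes(ln))
# -> 00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E
```

Returning the string instead of writing it leaves the caller free to decide whether a trailing space is wanted.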

The next step consists of printing out the ASCII printable characters. I have devised the following silliness:

printable = ' 1!2@3#4$5%6^7&8*9(0)aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ\
`~-_=+\\|[{]};:\'",<.>/?'
for c in ln:
    if c in printable: sys.stdout.write(c)
    else: sys.stdout.write('.')

print

When run after the list-comprehension code above, this produces the desired output:

00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E ... 456.89abcde~

I had considered using the .translate() method of strings, however this would require a larger translation table than my printable string. I was also using the .find() method of the printable string before realizing I could use 'in' here as well.
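
For what it's worth, the translation-table route might look something like the sketch below. It assumes Python 3 and bytes input, and it assumes that "printable" means the range 0x20 through 0x7E, which appears to be the same set as the hand-typed string. DOT_TABLE and ascii_column are illustrative names only. The table is 256 entries long, but it is built by a one-line expression rather than typed out:

```python
# Build the 256-entry table once: printable ASCII (0x20..0x7E) maps to
# itself, every other byte value maps to '.'.
DOT_TABLE = bytes(b if 0x20 <= b <= 0x7E else ord('.') for b in range(256))

def ascii_column(data):
    """Return data as text with non-printable bytes replaced by '.'."""
    return data.translate(DOT_TABLE).decode('ascii')

ln = b'\x00\x01\xFF 456\x0889abcde~'
print(ascii_column(ln))
# -> ... 456.89abcde~
```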

I'd like to display the non-printable characters differently, since they can't be distinguished from genuine period '.' characters. Thus, I may use ANSI escape sequences like:

for c in ln:
    if c in printable: sys.stdout.write(c)
    else:
        # red '.', then reset so the following characters are not colored
        sys.stdout.write('\x1B[31m.')
        sys.stdout.write('\x1B[0m')

print
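
The same idea can be sketched as a function that builds the colored text and returns it, so it can be joined to the hex column before anything is written. This assumes Python 3, bytes input, a terminal that understands ANSI color codes, and the 0x20 to 0x7E range as the test for printable. The names RED, RESET and ascii_column_ansi are made up for the example:

```python
RED = '\x1b[31m'    # ANSI: switch foreground to red
RESET = '\x1b[0m'   # ANSI: back to default attributes

def ascii_column_ansi(data):
    """Return data as text, with each non-printable byte shown as a red '.'."""
    parts = []
    for b in data:
        if 0x20 <= b <= 0x7E:
            parts.append(chr(b))
        else:
            # Reset right after the dot so a genuine '.' stays uncolored.
            parts.append(RED + '.' + RESET)
    return ''.join(parts)

print(ascii_column_ansi(b'\x00\x01\xFF 456\x0889abcde~'))
```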


I'm also toying with the idea of showing hex bytes together with their ASCII representations, since I've often found it a chore to figure out which hex byte to change if I wanted to edit a certain ASCII char. Thus, I might display data something like this:

00(\0) 01() FF() 20( ) 34(4) 35(5) 36(6) 08(\b) 38(8) 39(9) 61(a) 62(b) 63(c) 64(d) 65(e) 7E(~)

Here printing chars are shown in parentheses, chars that have Python escape sequences are shown as their escapes in the parentheses, and non-printing chars with no escape are shown with nothing in the parentheses.
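
A rough sketch of that format, assuming Python 3 and bytes input. The ESCAPES table, annotate and annotated_line are hypothetical names, and the choice of which escapes to list (including '\0' for the null byte, as in the sample line above) is an assumption:

```python
# Escape labels for the bytes that have a short backslash escape.
ESCAPES = {0x00: '\\0', 0x07: '\\a', 0x08: '\\b', 0x09: '\\t',
           0x0A: '\\n', 0x0B: '\\v', 0x0C: '\\f', 0x0D: '\\r'}

def annotate(b):
    """Format one byte value as HH(label)."""
    if 0x20 <= b <= 0x7E:
        label = chr(b)               # printing char: show it as-is
    else:
        label = ESCAPES.get(b, '')   # escape if one exists, else empty
    return '%02X(%s)' % (b, label)

def annotated_line(data):
    return ' '.join(annotate(b) for b in data)

print(annotated_line(b'\x00\x01\xFF 456\x0889abcde~'))
# -> 00(\0) 01() FF() 20( ) 34(4) 35(5) 36(6) 08(\b) 38(8) 39(9) 61(a) 62(b) 63(c) 64(d) 65(e) 7E(~)
```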

Or perhaps a two-line output with offset addresses under the data. So many possibilities!
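
The two-line idea could be sketched like this, assuming Python 3, bytes input, and offsets shown as two hex digits so they line up under the data (only the low byte of the offset is kept, which is a simplification for short lines). two_line and its start parameter are illustrative names:

```python
def two_line(data, start=0):
    """Return a hex row with the offset of each byte on the row beneath it."""
    hex_row = ' '.join('%02X' % b for b in data)
    # Keep only the low byte of each offset so every cell stays two digits wide.
    off_row = ' '.join('%02X' % ((start + i) & 0xFF) for i in range(len(data)))
    return hex_row + '\n' + off_row

print(two_line(b'\x00\x01\xFF 456', start=0x10))
# -> 00 01 FF 20 34 35 36
#    10 11 12 13 14 15 16
```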


Thanks for input!




--
_____________________
Christopher R. Carlen
crobc@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
SuSE 9.1 Linux 2.6.5