Re: converting a sed / grep / awk / . . . bash pipe line into python



hofer wrote:

Something I have to do very often is filtering / transforming line-based
file contents and storing the result in an array or a dictionary.

Very often the functionality already exists in the form of a shell script
using sed / awk / grep, ...
and I would like to have the same implementation in my script.

What's a compact, efficient (no intermediate arrays generated /
regexps compiled only once) way to write this kind of 'pipeline'
in Python?

Example 1 (in bash), annotated with comments (so it will not work if
copied and pasted):

cat file \                            ### read from file
| sed 's/\/\/.*//' \                  ### remove '//' comments
| sed 's/#.*//' \                     ### remove '#' comments
| grep -v '^\s*$' \                   ### get rid of empty lines
| awk '{ print $1 + $2 " " $2 }' \    ### knowing that all remaining lines
                                      ### always contain at least two integers,
                                      ### calculate sum and 'keep' second number
| grep '^42 ' \                       ### keep lines for which sum is 42
| awk '{ print $2 }'                  ### print number

Thanks in advance for any suggestions on how to code this (keeping the
comments).

for line in open("file"):  # read from file
    try:
        # remove extra columns, convert to integer
        a, b = map(int, line.split(None, 2)[:2])
    except ValueError:
        # remove comments, get rid of empty lines,
        # skip lines with fewer than two integers
        pass
    else:
        # line did start with two integers
        if a + b == 42:  # keep lines for which the sum is 42
            print(b)     # print number

The hard part was keeping the comments ;)

Without them it looks better:

import sys

for line in sys.stdin:
    try:
        a, b = map(int, line.split(None, 2)[:2])
    except ValueError:
        pass
    else:
        if a + b == 42:
            print(b)
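
If you would rather keep the one-stage-per-step shape of the shell pipeline, chained generators are one way to do it. What follows is only a sketch, not a drop-in replacement: the function names (strip_comments, non_blank, sum_and_second, select_42) and the two patterns are made up for illustration, and it relies on your statement that every remaining line starts with at least two integers. Each stage pulls one line at a time from the previous one, so no intermediate list is built, and the regexps are compiled once at module level.

```python
import re

# Hypothetical patterns mirroring the two sed stages; compiled only once.
SLASH_COMMENT = re.compile(r"//.*")
HASH_COMMENT = re.compile(r"#.*")


def strip_comments(lines):
    # Counterpart of the two sed calls: drop '//' and '#' comments.
    for line in lines:
        yield HASH_COMMENT.sub("", SLASH_COMMENT.sub("", line))


def non_blank(lines):
    # Counterpart of the grep -v stage: skip empty / whitespace-only lines.
    for line in lines:
        if line.strip():
            yield line


def sum_and_second(lines):
    # Counterpart of the first awk stage: yield (a + b, b).
    # Assumes each line starts with at least two integers; anything else
    # raises ValueError or IndexError instead of being skipped silently.
    for line in lines:
        fields = line.split()
        a, b = int(fields[0]), int(fields[1])
        yield a + b, b


def select_42(lines):
    # Counterpart of the final grep and awk stages: keep the second number
    # of every line whose sum is 42. Only the result list is materialised.
    pipeline = sum_and_second(non_blank(strip_comments(lines)))
    return [b for total, b in pipeline if total == 42]
```

Any iterable of lines will do as input, so select_42(open("file")) or select_42(sys.stdin) both work. Unlike the try/except version above, this one fails loudly on a malformed line, which may or may not be what you want.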

Peter