Re: file reading by record separator (not line by line)



In <1180614374.027569.235540@xxxxxxxxxxxxxxxxxxxxxxxxxxx>, Lee Sander
wrote:

Dear all,
I would like to read a really huge file that looks like this:

>name1....
line_11
line_12
line_13
...
>name2 ...
line_21
line_22
...
etc

where line_ij is just free-form text on that line.

How can I read the file so that every time I do a "read()" I get exactly
one record, up to the next ">"?

There was just recently a thread with an `itertools.groupby()` solution.
Something like this:

from itertools import groupby
from operator import itemgetter


def mark_records(lines):
    # Tag every line with a running record number; a line that
    # starts with '>' opens a new record.
    counter = 0
    for line in lines:
        if line.startswith('>'):
            counter += 1
        yield (counter, line)


def iter_records(lines):
    fst = itemgetter(0)
    snd = itemgetter(1)
    # groupby() collects consecutive lines that share a record number.
    for dummy, record_lines in groupby(mark_records(lines), fst):
        # Strip the record number again and hand back only the lines.
        yield map(snd, record_lines)


def main():
    source = """\
>name1....
line_11
line_12
line_13
....
>name2 ...
line_21
line_22
....""".splitlines()

    for record in iter_records(source):
        print('Start of record...')
        for line in record:
            print(':', line)


if __name__ == '__main__':
    main()
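
One thing to keep in mind with `groupby()`: each group is a lazy iterator that shares the underlying input, so a record has to be consumed before the next one is requested. If whole records are needed as lists, something along these lines might do. This is only a sketch: `read_records` and its `marker` parameter are made-up names, and `io.StringIO` stands in for a real file object.

```python
import io
from itertools import groupby


def read_records(lines, marker='>'):
    """Yield each record as a list of lines (hypothetical helper)."""
    def numbered(lines):
        # Same idea as mark_records(): bump the number at every marker line.
        record_no = 0
        for line in lines:
            if line.startswith(marker):
                record_no += 1
            yield record_no, line.rstrip('\n')

    for _, group in groupby(numbered(lines), key=lambda pair: pair[0]):
        # Copy the group into a list before groupby() moves on.
        yield [line for _, line in group]


# io.StringIO is an assumption here; a file opened with open() iterates
# line by line in the same way, so a huge file is never read in one go.
sample = io.StringIO(">name1\nline_11\nline_12\n>name2\nline_21\n")
records = list(read_records(sample))
```

Only one record is held in memory at a time as long as the caller does not collect them all, as the last line above does for demonstration.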

Ciao,
Marc 'BlackJack' Rintsch