Re: Simple text parsing gets difficult when line continues to next line




Tim Hochberg wrote:
[snip]
I agree that mixing the line assembly and parsing is probably a mistake
although using next explicitly is fine as long as your careful with it.
For instance, I would be wary to use the mixed for-loop, next strategy
that some of the previous posts suggested. Here's a different,
generator-based implementation of the same idea that, for better or for
worse is considerably less verbose:

[snip]

Here's a somewhat less verbose version of the state machine gadget.

def continue_join_3(linesin):
linesout = []
buff = ""
pending = 0
for line in linesin:
# remove *all* trailing whitespace
line = line.rstrip()
if line.endswith('_'):
buff += line[:-1]
pending = 1
else:
linesout.append(buff + line)
buff = ""
pending = 0
if pending:
raise ValueError("last line is continued: %r" % line)
return linesout

FWIW, it works all the way back to Python 2.1

Cheers,
John,

.



Relevant Pages