Re: Program inefficiency?
- From: thebjorn <BjornSteinarFjeldPettersen@xxxxxxxxx>
- Date: Sat, 29 Sep 2007 10:33:54 -0700
On Sep 29, 5:22 pm, hall.j...@xxxxxxxxx wrote:
> I wrote the following simple program to loop through our help files
> and fix some errors (in case you can't see the subtle RE search that's
> happening, we're replacing spaces in bookmarks with _'s).
>
> The program works great except for one thing. It's significantly
> slower through the later files in the search than through the early
> ones... Before anyone criticizes, I recognize that that middle section
> could be simplified with a for loop... I just haven't cleaned it
> up...
>
> The problem is that the first 300 files take about 10-15 seconds and
> the last 300 take about 2 minutes... If we do more than about 1500
> files in one run, it just hangs up and never finishes...
>
> Is there a solution here that I'm missing? What am I doing that is so
> inefficient?
Ugh, that was entirely too many regexps for my taste :-)
How about something like:
def attr_ndx_iter(txt, attribute):
    "Return all the start and end indices for the values of attribute."
    # Lower-casing ASCII text keeps its length, so the indices found here
    # also fit the original, mixed-case text.
    txt = txt.lower()
    attribute = attribute.lower() + '='
    alen = len(attribute)
    chunks = txt.split(attribute)
    if len(chunks) == 1:
        return
    start = len(chunks[0]) + alen
    for chunk in chunks[1:]:
        qchar = chunk[0]  # the quote character that opens the value
        end = start + chunk.index(qchar, 1)
        yield start + 1, end
        start += len(chunk) + alen

def substr_map(txt, indices, fn):
    "Apply fn to text within indices."
    res = []
    cur = 0
    for i, j in indices:
        res.append(txt[cur:i])
        res.append(fn(txt[i:j]))
        cur = j
    res.append(txt[cur:])
    return ''.join(res)

def transform(s):
    "The transformation to do on the attribute values."
    return s.replace(' ', '_')

def zap_spaces(txt, *attributes):
    for attr in attributes:
        txt = substr_map(txt, attr_ndx_iter(txt, attr), transform)
    return txt
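
For a quick sanity check of what zap_spaces should produce, here is a small cross-check that uses a different technique: a single re.sub pass with a callback. It is only a sketch and not part of the code above. The helper name make_zapper and the sample string are made up for illustration, and it is written for a current Python 3. It also differs slightly in behaviour: it needs a word boundary before the attribute name and a matching closing quote.

```python
import re

# Hypothetical helper, not part of the post above: builds one compiled pattern
# that matches attr="value" or attr='value' for the given attribute names.
def make_zapper(*attributes):
    names = '|'.join(re.escape(a) for a in attributes)
    pattern = re.compile(r'''\b(%s)=(["'])(.*?)\2''' % names,
                         re.IGNORECASE | re.DOTALL)

    def fix(match):
        # Rebuild the attribute, replacing spaces only inside the quoted value.
        attr, quote, value = match.groups()
        return '%s=%s%s%s' % (attr, quote, value.replace(' ', '_'), quote)

    def zap(txt):
        return pattern.sub(fix, txt)

    return zap

zap = make_zapper('href', 'name')
sample = '<a HREF="my page.htm#top of page" name=\'first bookmark\'>see this</a>'
result = zap(sample)
print(result)
# <a HREF="my_page.htm#top_of_page" name='first_bookmark'>see this</a>
```

Text outside the attribute values ("see this") is left alone, which is the behaviour to compare against.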
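
def mass_replace():
    import sys
    w = sys.stdout.write
    for f in open(r'pathname\editfile.txt'):
        f = f.strip()  # drop the trailing newline from the file name
        if not f:
            continue
        try:
            # Read first: open(f, 'w') truncates the file as soon as it is called.
            txt = open(f).read()
            open(f, 'w').write(zap_spaces(txt, 'href', 'name'))
            w('.')  # progress-meter :-)
        except Exception:
            print('Error processing file: %s' % f)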
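
One thing to watch when rewriting files in place: opening a file with 'w' truncates it, so the old contents have to be read before that happens, and a crash halfway through a write can leave a damaged help file behind. A possible way around both, sketched here for a current Python 3, is to write to a temporary file and swap it in afterwards. The name rewrite_file, the demo file, and the lambda standing in for zap_spaces are assumptions made for this example only.

```python
import os
import tempfile

def rewrite_file(path, transform):
    """Read path, apply transform to its text, and replace the file only on success."""
    with open(path) as src:
        original = src.read()      # read everything before any write happens
    updated = transform(original)
    if updated == original:
        return False               # nothing changed, leave the file alone
    # Write to a temporary file in the same directory, then swap it in.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    try:
        with os.fdopen(fd, 'w') as dst:
            dst.write(updated)
        os.replace(tmp_path, path)  # available since Python 3.3
    except Exception:
        os.remove(tmp_path)
        raise
    return True

# Demo on a throwaway file; the lambda is a stand-in for zap_spaces.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'demo.htm')
with open(demo_path, 'w') as f:
    f.write('<a name="top of page">')
changed = rewrite_file(demo_path, lambda s: s.replace('top of page', 'top_of_page'))
```

Returning False when nothing changed also avoids touching files that have no spaces to fix.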
minimally-tested'ly y'rs
-- bjorn