Re: A design problem
- From: "Gabriel Genellina" <gagsl-py2@xxxxxxxxxxxx>
- Date: Thu, 31 Jan 2008 02:35:23 -0200
On Thu, 31 Jan 2008 01:57:41 -0200, Dan Upton <upton@xxxxxxxxxxxx> wrote:
Or: How to write Python like a Python programmer, not a Java
programmer. This will be a little long-winded...
So I just recently started picking up Python, mostly learning the new
bits I need via Google and otherwise cobbling together the functions
I've already written. It occurred to me though that one of my
programs was still probably written very much like I would in Java
(part of the reason I'm picking up Python is I'm tired of my coworkers
making fun of me for writing parsing/reformatting programs in Java).
Maybe you've already read this, but I'll post the links anyway:
http://dirtsimple.org/2004/12/python-is-not-java.html
http://dirtsimple.org/2004/12/java-is-not-python-either.html
Anyway, basically here's the problem I have:
-Fork off n copies of a program, where n is a command line parameter,
and save their PIDs. The way I've been accomplishing this is
basically:
processes = []
for i in range(numProcs):
    pid = os.fork()
    if pid == 0:
        # do the forking
    else:
        processes.append(pid)
Looks fine to me.
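For reference, a minimal runnable sketch of that fork loop. The child here simply exits via os._exit, which is an assumption standing in for whatever the real child does (for example an os.exec* call), and the name spawn_children is made up for illustration.

```python
import os

def spawn_children(num_procs):
    """Fork num_procs children and return their PIDs (parent side only)."""
    processes = []
    for _ in range(num_procs):
        pid = os.fork()
        if pid == 0:
            # Child: placeholder for the real work (assumption).
            # os._exit keeps the child from continuing the loop
            # and from running the parent's cleanup handlers.
            os._exit(0)
        else:
            # Parent: remember the child's PID.
            processes.append(pid)
    return processes

if __name__ == "__main__":
    pids = spawn_children(3)
    # Reap the children so they do not linger as zombies.
    for pid in pids:
        os.waitpid(pid, 0)
    print(len(pids))
```

Without the exit (or an exec) in the child branch, each child would fall through and keep forking on the next loop iteration.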
-Every so much time (say, every second), I want to ask the OS
something about that process from under /proc/pid (this is on Linux),
including what core it's on.
while 1:
    for i in processes:
        file = open("/proc/"+str(i)+"/stat")
(I hope there is a time.sleep(1) after processing that)
- Don't use file as a variable name; you're shadowing the builtin file type.
- "i" is heavily overloaded as a variable name everywhere; I'd use pid instead.
- String interpolation looks better (and is faster, but that's not so relevant here):
for pid in processes:
    statfile = open("/proc/%d/stat" % pid)
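A sketch of pulling the core number out of that file, assuming the layout documented in proc(5), where field 39 ("processor") is the CPU the task last ran on. The helper names parse_core and read_core are mine, not from the original post. The command name in field 2 is wrapped in parentheses and may itself contain spaces or parentheses, so the line is split after the last ')' rather than naively on whitespace.

```python
def parse_core(stat_text):
    """Return the CPU number from the contents of /proc/<pid>/stat."""
    # Everything after the last ')' starts at field 3 (state).
    _, _, rest = stat_text.rpartition(")")
    fields = rest.split()
    # Field 39 overall is index 39 - 3 = 36 in this list.
    return int(fields[36])

def read_core(pid):
    """Read /proc/<pid>/stat (Linux only) and return the core number."""
    with open("/proc/%d/stat" % pid) as statfile:
        return parse_core(statfile.read())
```

Using a with block also closes the stat file on every pass, which matters in a loop that runs once per second.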
From that, one of the pieces of data I'll get is which core it's
running on, which then will prompt me to open another file.
Ultimately, I want to have n files, that are a bunch of lines:
corenum data1 data2 ...
corenum data1 data2 ...
...
and so on. The way I was going to approach it was to every time
through the loop, read the data for one of the processes, open its
file, write out to it, and close it, then do the same for the next
process, and so on. Really though I don't need to be able to look at
the data until the processes are finished, and it would be less I/O,
at the expense of memory, to just save all of the lists of data as I
go along and then dump them out to disk at the end of the Python
program's execution. I feel like Python's lists or dictionaries
should be useful here, but I'm not really sure how to apply them,
particularly in a "Python-like" way.
For anybody who made it all the way through that description ;) any suggestions?
The simplest solution would be to use a tuple to store a row of data. You know (implicitly) what every element contains: the first item is "corenum", the second item is "data1", the third item is "data2", and so on (based on your example above).
Collect those tuples (rows) into a list (one list per process), and collect all lists into a dictionary indexed by pid.
That is, at the beginning, create an empty dictionary:
info = {}
After each fork, at the same time you save the pid, create an empty list for it:
info[pid] = []
After you read and process the /proc file to obtain what you want, append a new element to that list:
info[pid].append((corenum, data1, data2, ...))
(notice the double parentheses: the inner pair builds the tuple that gets appended)
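Put together, the structure looks roughly like this. The PIDs and sample values are invented purely for illustration.

```python
info = {}

# Pretend these PIDs came back from os.fork() (made-up values).
processes = [1001, 1002]
for pid in processes:
    info[pid] = []

# One tuple per sample: (corenum, data1, data2); values are made up.
info[1001].append((0, 12, 34))
info[1001].append((1, 13, 35))
info[1002].append((3, 99, 98))

# Each row unpacks back into its named pieces.
for pid in processes:
    for corenum, data1, data2 in info[pid]:
        print(pid, corenum, data1, data2)
```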
At the end, write all that info to disk. The csv module looks like a good candidate:
import csv
for pid in processes:
    writer = csv.writer(open("process-%d.csv" % pid, "wb"))
    writer.writerows(info[pid])
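The snippet above is Python 2 style. On Python 3 the csv module wants a text-mode file opened with newline="" rather than "wb". A sketch of the same idea follows; the helper dump_info and its outdir parameter are hypothetical and not part of the original.

```python
import csv
import os

def dump_info(info, outdir="."):
    """Write one process-<pid>.csv per PID; each tuple becomes a row."""
    for pid, rows in info.items():
        path = os.path.join(outdir, "process-%d.csv" % pid)
        # newline="" lets the csv module control line endings itself.
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(rows)
```

The with block closes each file as soon as its rows are written, instead of relying on garbage collection.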
That's all.
--
Gabriel Genellina