Re: A design problem



On Thu, 31 Jan 2008 01:57:41 -0200, Dan Upton <upton@xxxxxxxxxxxx> wrote:

Or: How to write Python like a Python programmer, not a Java
programmer. This will be a little long-winded...

So I just recently started picking up Python, mostly learning the new
bits I need via Google and otherwise cobbling together the functions
I've already written. It occurred to me though that one of my
programs was still probably written very much like I would in Java
(part of the reason I'm picking up Python is I'm tired of my coworkers
making fun of me for writing parsing/reformatting programs in Java).

Maybe you've already read this, but I'll post the links anyway:
http://dirtsimple.org/2004/12/python-is-not-java.html
http://dirtsimple.org/2004/12/java-is-not-python-either.html

Anyway, basically here's the problem I have:

-Fork off n copies of a program, where n is a command line parameter,
and save their PIDs. The way I've been accomplishing this is
basically:

processes=[]
for i in range(numProcs):
    pid=os.fork()
    if pid == 0:
        # do the forking
    else:
        processes.append(pid)

Looks fine to me.
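
One detail the loop leaves open is what the child does in the `pid == 0` branch. Here is a minimal sketch, assuming the child should replace itself with the target program through `os.execvp`. The function name `spawn_copies` and its `argv` parameter are hypothetical, not part of the original post:

```python
import os


def spawn_copies(num_procs, argv):
    """Fork num_procs children that each exec argv; return the child PIDs."""
    pids = []
    for _ in range(num_procs):
        pid = os.fork()
        if pid == 0:
            # Child: replace this process with the target program.
            try:
                os.execvp(argv[0], argv)
            except OSError:
                pass
            # Only reached if exec failed; leave without running the
            # parent's cleanup handlers or continuing the loop.
            os._exit(127)
        # Parent: remember the child's PID and keep forking.
        pids.append(pid)
    return pids
```

The important point is that the child never falls through to the next loop iteration; otherwise each child would start forking copies of its own.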

-Every so much time (say, every second), I want to ask the OS
something about that process from under /proc/pid (this is on Linux),
including what core it's on.
while 1:
    for i in processes:
        file = open("/proc/"+str(i)+"/stat")

(I hope there is a time.sleep(1) after processing that)
- don't use file as a variable name, you're shadowing the builtin file type.
- "i" is very much overloaded as a variable name everywhere... I'd use pid instead
- string interpolation looks better (and is faster, but that's not so relevant here)
for pid in processes:
    statfile = open("/proc/%d/stat" % pid)
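
For the reading step itself, something along these lines might work. It is only a sketch, assuming a Linux /proc layout as described in proc(5), where the "processor" value (the CPU the task last ran on) is field 39 of /proc/<pid>/stat. The names `core_from_stat` and `poll_once` are hypothetical:

```python
def core_from_stat(stat_text):
    """Return the CPU number from the contents of /proc/<pid>/stat."""
    # The second field (the command name) is parenthesised and may contain
    # spaces, so split only the part after the last ')'.
    fields = stat_text[stat_text.rindex(")") + 1:].split()
    # fields[0] is field 3 of the file; "processor" is field 39 in proc(5).
    return int(fields[39 - 3])


def poll_once(pids):
    """Read the current core of every PID that still has a /proc entry."""
    cores = {}
    for pid in pids:
        try:
            with open("/proc/%d/stat" % pid) as statfile:
                cores[pid] = core_from_stat(statfile.read())
        except OSError:
            # The process has gone away (or /proc is unavailable); skip it.
            continue
    return cores
```

The outer loop would then call `poll_once(processes)` and `time.sleep(1)` on each pass. Using `with` closes each stat file promptly, which matters when the loop runs once a second for a long time.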

From that, one of the pieces of data I'll get is which core it's
running on, which then will prompt me to open another file.
Ultimately, I want to have n files, that are a bunch of lines:
corenum data1 data2 ...
corenum data1 data2 ...
...

and so on. The way I was going to approach it was to every time
through the loop, read the data for one of the processes, open its
file, write out to it, and close it, then do the same for the next
process, and so on. Really though I don't need to be able to look at
the data until the processes are finished, and it would be less I/O,
at the expense of memory, to just save all of the lists of data as I
go along and then dump them out to disk at the end of the Python
program's execution. I feel like Python's lists or dictionaries
should be useful here, but I'm not really sure how to apply them,
particularly in a "Python-like" way.

For anybody who made it all the way through that description ;) any suggestions?

The simplest solution would be to use a tuple to store a row of data. You know (implicitly) what every element contains: the first item is "corenum", the second item is "data1", the third item is "data2", and so on (based on your example above).
Collect those tuples (rows) into a list (one list per process), and collect all lists into a dictionary indexed by pid.

That is, at the beginning, create an empty dictionary:
info = {}

After each fork, at the same time you save the pid, create the empty list:
info[pid] = []

After you read and process the /proc file to obtain what you want, append a new element to that list:
info[pid].append((corenum, data1, data2, ...))
(notice the double parentheses: the inner pair builds the tuple, the outer pair belongs to the call)
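
Put together, the structure is a dictionary of lists of tuples. A small sketch with made-up PIDs and values follows. The helper `record` is hypothetical, and it uses `dict.setdefault` so the empty list is created on first use rather than right after the fork:

```python
info = {}


def record(pid, corenum, *data):
    """Append one (corenum, data...) row to the list kept for pid."""
    info.setdefault(pid, []).append((corenum,) + data)


# Illustrative values only; real rows would come from /proc/<pid>/stat.
record(101, 0, 12, 34)
record(101, 1, 13, 35)
record(102, 3, 7, 9)

print(info[101])  # [(0, 12, 34), (1, 13, 35)]
```

Each list keeps its rows in the order they were sampled, so writing them out later preserves the timeline for that process.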

At the end, write all that info to disk. The csv module looks like a good candidate:

import csv

for pid in processes:
    writer = csv.writer(open("process-%d.csv" % pid, "wb"))
    writer.writerows(info[pid])
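
The "wb" mode above is what the csv module expects on Python 2. On Python 3 the file has to be opened in text mode with newline="" instead. A sketch of the same dump under that assumption, with a hypothetical `dump_info` helper and `directory` parameter:

```python
import csv
import os


def dump_info(info, directory="."):
    """Write one process-<pid>.csv file per PID; return the paths written."""
    paths = []
    for pid, rows in info.items():
        path = os.path.join(directory, "process-%d.csv" % pid)
        # Python 3: text mode with newline="" replaces Python 2's "wb".
        with open(path, "w", newline="") as csvfile:
            csv.writer(csvfile).writerows(rows)
        paths.append(path)
    return paths
```

The `with` block also makes sure each file is flushed and closed before the next one is opened.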

That's all
--
Gabriel Genellina
