Re: * 'struct-like' list *



"Ernesto" <erniedude@xxxxxxxxx> wrote in message
news:1139245389.529742.317110@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
I'm still fairly new to python, so I need some guidance here...

I have a text file with lots of data. I only need some of the data. I
want to put the useful data into an [array of] struct-like
mechanism(s). The text file looks something like this:

[BUNCH OF NOT-USEFUL DATA....]

Name: David
Age: 108 Birthday: 061095 SocialSecurity: 476892771999

[MORE USELESS DATA....]

Name........

I would like to have an array of "structs." Each struct has

struct Person{
string Name;
int Age;
int Birthday;
int SS;
}

I want to go through the file, filling up my list of structs.

My problems are:

1. How to search for the keywords "Name:", "Age:", etc. in the file...
2. How to implement some organized "list of lists" for the data
structure.

Any help is much appreciated.

Ernesto -

Since you are searching for keywords and matching fields, and trying to
populate data structures as you go, this sounds like a good fit for
pyparsing. Pyparsing has built-in features for scanning through text and
extracting data, with suitably named data fields for accessing later.

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul

------------------------------------------------
from pyparsing import *

inputData = """[BUNCH OF NOT-USEFUL DATA....]

Name: David
Age: 108 Birthday: 061095 SocialSecurity: 476892771999

[MORE USELESS DATA....]

Name: Fred
Age: 101 Birthday: 061065 SocialSecurity: 587903882000

[MORE USELESS DATA....]

Name: Barney
Age: 99 Birthday: 061265 SocialSecurity: 698014993111

[MORE USELESS DATA....]

"""

dob = Word(nums,exact=6)
# this matches your sample data, but I think SSN's are only 9 digits long
socsecnum = Word(nums,exact=12)

# define the personalData pattern - use results names to associate
# field names with matched tokens, can then access data as if they were
# attributes on an object
personalData = ( "Name:" + empty + restOfLine.setResultsName("Name") +
                 "Age:" + Word(nums).setResultsName("Age") +
                 "Birthday:" + dob.setResultsName("Birthday") +
                 "SocialSecurity:" + socsecnum.setResultsName("SS") )

# use personalData.scanString to scan through the input, returning the
# matching tokens, and their respective start/end locations in the string
for person,s,e in personalData.scanString(inputData):
    print "Name:", person.Name
    print "Age:", person.Age
    print "DOB:", person.Birthday
    print "SSN:", person.SS
    print

# or use a list comp to scan the whole file, and return your Person data,
# giving you your requested array of "structs" - not really structs, but
# ParseResults objects
persons = [person for person,s,e in personalData.scanString(inputData)]

# or convert to Python dicts, which some people prefer to pyparsing's
# ParseResults
persons = [dict(p) for p,s,e in personalData.scanString(inputData)]
print persons[0]
print

# or create an array of Person objects, as suggested in previous postings
class Person(object):
    def __init__(self,parseResults):
        self.__dict__.update(dict(parseResults))

    def __str__(self):
        return "Person(%s, %s, %s, %s)" % (
            self.Name,self.Age,self.Birthday,self.SS)

persons = [Person(p) for p,s,e in personalData.scanString(inputData)]
for p in persons:
    print p.Name,"->",p

--------------------------------------
prints out:
Name: David
Age: 108
DOB: 061095
SSN: 476892771999

Name: Fred
Age: 101
DOB: 061065
SSN: 587903882000

Name: Barney
Age: 99
DOB: 061265
SSN: 698014993111

{'SS': '476892771999', 'Age': '108', 'Birthday': '061095', 'Name': 'David'}

David -> Person(David, 108, 061095, 476892771999)
Fred -> Person(Fred, 101, 061065, 587903882000)
Barney -> Person(Barney, 99, 061265, 698014993111)
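
A note on types: the struct in the original question declares Age and SS as ints, but every field in the results shown above is a string. What follows is a minimal sketch, not part of the original reply, of one way to turn a dict like persons[0] into a typed record. It assumes Python 3.7+ (for dataclasses); the TypedPerson class and the person_from_fields helper are hypothetical names made up for this illustration. Birthday is kept as text, since int("061095") would drop the leading zero.

```python
from dataclasses import dataclass


@dataclass
class TypedPerson:
    """Hypothetical typed record mirroring the struct in the question."""
    name: str
    age: int
    birthday: str  # kept as text so the leading zero in "061095" survives
    ss: int


def person_from_fields(fields):
    """Build a TypedPerson from a dict of strings keyed by the results names."""
    return TypedPerson(
        name=fields["Name"].strip(),
        age=int(fields["Age"]),
        birthday=fields["Birthday"],
        ss=int(fields["SS"]),
    )


# Sample dict copied from the printed persons[0] output above.
sample = {'SS': '476892771999', 'Age': '108', 'Birthday': '061095', 'Name': 'David'}
david = person_from_fields(sample)
print(david)
```

The same helper could be applied to each dict produced by the dict(p) list comprehension, giving a list of typed records instead of a list of string dicts.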


