Re: Extracting patterns after matching a regex



On Sep 8, 12:16 pm, nn <prueba...@xxxxxxxxxxxxx> wrote:
On Sep 8, 11:19 am, Dave Angel <da...@xxxxxxxx> wrote:



Mart. wrote:
<snip>
I have been doing this to turn the email into a string

email = sys.argv[1]
f = open(email, 'r')
s = str(f.readlines())

so FTPHOST isn't the first element, it is just part of a larger
string. When I turn the email into a string it looks like...

'FINISHED: 09/07/2009 08:42:31\r\n', '\r\n', 'MEDIATYPE: FtpPull\r\n',
'MEDIAFORMAT: FILEFORMAT\r\n', 'FTPHOST: e4ftl01u.ecs.nasa.gov\r\n',
'FTPDIR: /PullDir/0301872638CySfQB\r\n', 'Ftp Pull Download Links: \r\n',
'ftp://e4ftl01u.ecs.nasa.gov/PullDir/0301872638CySfQB\r\n',
'Download ZIP file of packaged order:\r\n',
<snip>

The mistake I see is trying to turn a list into a string, just so you
can try to parse it back again.  Just write a loop that iterates through
the list that readlines() returns.
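
A minimal sketch of what such a loop might look like, assuming the header lines have the "NAME: value" shape shown in the sample above. The helper name find_field and the hard-coded sample list are illustrative only and do not come from the original messages; in the real script the list would come from f.readlines().

```python
def find_field(lines, name):
    """Return the text after 'NAME:' on the first line that starts with it."""
    prefix = name + ':'
    for line in lines:
        if line.startswith(prefix):
            # Drop the prefix, then the surrounding whitespace and '\r\n'.
            return line[len(prefix):].strip()
    return None  # field not present


# Sample lines copied from the email shown above (stand-in for f.readlines()).
sample = [
    'MEDIATYPE: FtpPull\r\n',
    'FTPHOST: e4ftl01u.ecs.nasa.gov\r\n',
    'FTPDIR: /PullDir/0301872638CySfQB\r\n',
]
url = 'ftp://' + find_field(sample, 'FTPHOST') + find_field(sample, 'FTPDIR')
print(url)
```

Working on the list directly means each element is still a real line ending in an actual '\r\n', so no escaped-backslash regex is needed.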

DaveA

No kidding.

Instead of this:
s = str(f.readlines())

ftphost = re.search(r'FTPHOST: (.*?)\\r',s).group(1)
ftpdir  = re.search(r'FTPDIR: (.*?)\\r',s).group(1)
url = 'ftp://' + ftphost + ftpdir

I would have possibly done something like this (not tested):
lines = f.readlines()
header={}
for row in lines:
    key,sep,value = row.partition(':')[2].rstrip()
    header[key.lower()]=value
url = 'ftp://' + header['ftphost'] + header['ftpdir']

Well, I said not tested. That should of course be:
lines = f.readlines()
header={}
for row in lines:
    key,sep,value = row.partition(':')
    header[key.lower()]=value.strip()  # strip() also drops the space after ':'
url = 'ftp://' + header['ftphost'] + header['ftpdir']
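
For anyone wanting to try the dictionary approach without the original email file, here is a self-contained sketch run against lines copied from the sample earlier in the thread. The function name parse_header and the skipping of lines that contain no colon are my own assumptions, not part of the posted code. It uses strip() instead of rstrip() because partition(':') leaves the space after the colon at the front of the value.

```python
def parse_header(lines):
    """Build a dict mapping lower-cased field names to stripped values."""
    header = {}
    for row in lines:
        # partition() splits on the first ':' only, so values that
        # contain colons themselves (such as times) stay intact.
        key, sep, value = row.partition(':')
        if not sep:
            continue  # no colon on this line, e.g. a bare '\r\n'
        header[key.strip().lower()] = value.strip()
    return header


# Stand-in for f.readlines(), taken from the sample email in this thread.
sample = [
    'FINISHED: 09/07/2009 08:42:31\r\n',
    '\r\n',
    'MEDIATYPE: FtpPull\r\n',
    'FTPHOST: e4ftl01u.ecs.nasa.gov\r\n',
    'FTPDIR: /PullDir/0301872638CySfQB\r\n',
]
header = parse_header(sample)
url = 'ftp://' + header['ftphost'] + header['ftpdir']
print(url)
```

Note that a line such as the 'ftp://...' download link would also be split at its first colon and stored under the key 'ftp'; that is harmless here, since only 'ftphost' and 'ftpdir' are looked up.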
