NNTP binary attachment downloader with asyncore and generators

From: fishboy (fishboy_at_spamspamspam.com)
Date: 05/31/04


Date: Mon, 31 May 2004 08:43:34 GMT

Howdy,

I'm in the middle of a personal project. Eventually it will download
multipart binary attachments and look for missing parts on other
servers. And so far I've got it to walk a newsgroup and download and
decode single part binaries.

I thought I'd post the code and see what people think. I'd appreciate
any feedback. It's my first program with generators and I'm worried
I'm making this twice as hard as it needs to be.

Thanks,
David Fisher

Oh yeah, email is fake since I'm deathly afraid of spam. Please post
replies here.

Code was working when I posted it. Just change the server name,
username, password, group info at the bottom to something less
virtual. :)

#!/usr/bin/env python2.3
#
import asyncore
import socket
import os
import uu
import email
#
class Newser(asyncore.dispatcher):

    def __init__(self, host,port,user,password,group):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect( (host,port) )
        self.buffer = ''
        self.user = user
        self.password = password
        self.group = group
        self.n = 0
        self.head = ''
        self.body = ''
        self.inbuffer = ''
        self.dataline = ''
        self.handleline = self.handleline_gen()

    def handle_connect(self):
        pass

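    def handle_close(self):
        # close the socket, otherwise a dropped connection stays
        # readable and asyncore.loop() spins on it forever
        self.close()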
        
    def writable(self):
        return (len(self.buffer) > 0)
    
    def handle_write(self):
        print 'sending: ' + self.buffer.strip()
        sent = self.send(self.buffer)
        self.buffer = self.buffer[sent:]
        if self.buffer:
            print 'didnt send whole line' #does this ever happen?
            print 'didnt send whole line' #just getting my attention
            print 'didnt send whole line' #in case it does

    def handle_read(self):
        self.inbuffer += self.recv(8192)
        while 1:
            n = self.inbuffer.find('\r\n')
            if n > -1:
                self.dataline = self.inbuffer[:n+2]
                self.inbuffer = self.inbuffer[n+2:]
                try:
                    result = self.handleline.next()
                    if result == 'OK':
                        pass # everything is groovy
                    elif result == 'DONE':
                        self.del_channel() # group walk is finished
                        break
                    else:
                        print 'something has gone wrong!'
                        print result
                        print self.dataline
                        self.del_channel()
                        break
                except StopIteration:
                    print 'should never be here'
                    print 'why did my generator run out?'
                    print 'why god? why?!'
                    print self.dataline
                    self.del_channel()
                    break
            else:
                break

    def handleline_gen(self):
        #
        # handshakey stuff
        # welcome username password group
        # after this is set we'll start the message walk
        #
        if self.dataline[:3] == '200': # welcome, post ok
            print self.dataline.strip()
            self.buffer = 'authinfo user ' + self.user + '\r\n'
            yield 'OK'
        else:
            yield 'WTF?! fail welcome? god hates me!'
        #
        if self.dataline[:3] == '381': # more auth needed
            print self.dataline.strip()
            self.buffer = 'authinfo pass ' + self.password + '\r\n'
            yield 'OK'
        else:
            yield 'WTF?! fail authinfo user'
        #
        if self.dataline[:3] == '281': # auth ok, go to town!
            print self.dataline.strip()
            self.buffer = 'group ' + self.group + '\r\n'
            yield 'OK'
        else:
            yield 'WTF?! fail authinfo pass'
        #
        if self.dataline[:3] == '211': # group
            print self.dataline.strip()
            self.buffer = 'next\r\n'
            yield 'OK'
        else:
            yield 'WTF?! fail group'
        #
        # main state loop
        # walk from one message to the next
        # issuing HEAD and BODY for each
        # never reenter here after we receive '421', no next article
        # so we should never raise StopIteration
        #
        while 1:
            #
            if self.dataline[:3] == '223': # next
                print self.dataline.strip()
                self.buffer = 'head\r\n'
                yield 'OK'
            elif self.dataline[:3] == '421': # err, no next article
                yield 'DONE'
            else:
                yield 'WTF?! fail next'
            #
            if self.dataline[:3] == '221': # head
                print self.dataline.strip()
                self.head = ''
                yield 'OK'
                # XXX what am I going to do if the server explodes
                while self.dataline <> '.\r\n':
                    self.head += self.dataline
                    yield 'OK'
                # XXX parse headers here
                # XXX decide whether we want body
                self.buffer = 'body\r\n'
                yield 'OK'
            else:
                yield 'WTF?! fail head'
            #
            if self.dataline[:3] == '222': # body
                print self.dataline.strip()
                self.body = ''
                yield 'OK'
                # XXX what am I going to do if the server explodes
                while self.dataline <> '.\r\n':
                    # XXX line-by-line decode here (someday)
                    self.body += self.dataline
                    yield 'OK'
                self.decode()
                self.buffer = 'next\r\n'
                yield 'OK'
            else:
                yield 'WTF?! fail body'

    def decode(self):
        """decode message body.
        try UU first, just decode body
        then mime, decode head+body
        save in tempfile if fail"""
        tempname = 'temp' + `self.n` + '.txt'
        self.n += 1
        f = file(tempname,'wb')
        f.write(self.body)
        f.close()
        f = file(tempname)
        try:
            uu.decode(f)
        except Exception,v:
            f.close()
            print 'uu failed code: ',v
            print 'trying MIME'
            f = file(tempname,'wb')
            f.write(self.head+self.body)
            f.close()
            f = file(tempname)
            message = email.message_from_file(f)
            f.close()
            saved = False
            for part in message.walk():
                print part.get_content_type()
                filename = part.get_filename()
                if filename:
                    if not os.path.isfile(filename):
                        out = file(filename,'wb')
                        out.write(part.get_payload(decode=True))
                        out.close()
                        print 'yay! MIME!'
                        saved = True
                    else:
                        print "oops, we've already got one"
            # remove the temp file once, after the walk; removing it
            # inside the loop raised OSError on a second attachment
            if saved:
                os.remove(tempname)
        else:
            f.close()
            print 'yay! UU!'
            os.remove(tempname)
            
def main():
    mynews = Newser('news.server',119,'fishboy','pass','alt.binaries')
    try:
        asyncore.loop()
    except KeyboardInterrupt:
        mynews.del_channel()
        print 'yay! I quit!'

if __name__ == '__main__':
    main()
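
One possible way to cut down the repeated if/else status checks in
handleline_gen, sketched for current Python 3 (it needs send() and
"yield from") rather than the 2.3 the script above targets. The names
ProtocolError, expect, read_block, body_walk and feed are all made up
for this sketch, and the command sequence is trimmed to greeting, BODY
and NEXT. It also undoes NNTP dot-stuffing, where the server doubles a
leading '.' on text lines, which the loops above don't do.

```python
class ProtocolError(Exception):
    """Raised when the server sends an unexpected status line."""


def expect(code):
    """Wait for one line and check that it starts with the status code."""
    line = yield
    if not line.startswith(code):
        raise ProtocolError('expected %r, got %r' % (code, line))
    return line


def read_block():
    """Collect lines up to the lone '.' terminator, undoing dot-stuffing."""
    lines = []
    while True:
        line = yield
        if line == b'.\r\n':
            return b''.join(lines)
        if line.startswith(b'..'):
            line = line[1:]  # the server doubled the leading dot
        lines.append(line)


def body_walk(send, bodies):
    """Hypothetical session: greeting, then BODY/NEXT until a 421 reply."""
    yield from expect(b'200')
    send(b'body\r\n')
    while True:
        yield from expect(b'222')
        bodies.append((yield from read_block()))
        send(b'next\r\n')
        line = yield
        if line.startswith(b'421'):  # no next article, walk is finished
            return
        if not line.startswith(b'223'):
            raise ProtocolError('fail next: %r' % line)
        send(b'body\r\n')


def feed(gen, buffer):
    """Send each complete CRLF line to gen; return (leftover, finished)."""
    while True:
        n = buffer.find(b'\r\n')
        if n < 0:
            return buffer, False  # keep the partial line for the next recv
        line, buffer = buffer[:n + 2], buffer[n + 2:]
        try:
            gen.send(line)
        except StopIteration:
            return buffer, True
```

The generator has to be primed once with next(gen) before the first
feed() call. After that, something like handle_read would only need to
append recv() data to the leftover and call feed() again. Failures
show up as ProtocolError instead of yielded strings.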

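A rough sketch of the line-by-line UU decode that the XXX note in the
body loop points at, using binascii.a2b_uu from the standard library
instead of writing a temp file for uu.decode. It is written for Python
3, the function name uudecode_lines is made up, and it assumes a single
begin/end pair with well-formed lines. a2b_uu raises binascii.Error on
a damaged line, which this sketch does not try to recover from.

```python
import binascii


def uudecode_lines(lines):
    """Decode the first uuencoded file in an iterable of byte lines.

    Returns (filename, data), or None when no complete begin/end pair
    is found.
    """
    name = None
    chunks = []
    for line in lines:
        line = line.rstrip(b'\r\n')
        if name is None:
            # skip ordinary text until a 'begin <mode> <filename>' line
            parts = line.split(b' ', 2)
            if len(parts) == 3 and parts[0] == b'begin' and parts[1].isdigit():
                name = parts[2].decode('latin-1')
            continue
        if line == b'end':
            return name, b''.join(chunks)
        if line:
            chunks.append(binascii.a2b_uu(line))
    return None
```

Because it takes any iterable of lines, it could be fed the article
body as it arrives instead of after the whole thing has been stored in
self.body.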

