Re: Python for large projects

From: Bill Rubenstein (wsr2_at_swbell.net)
Date: 03/28/04


Date: Sun, 28 Mar 2004 14:49:58 GMT

In article <mailman.357.1080147061.742.python-list@python.org>, gabor@z10n.net
says...
> On Wed, 2004-03-24 at 15:16, Bill Rubenstein wrote:
> > ...snip...
> > > > other thing is, that in the projects i work on, there seems to be
> > > > very hard to do unit tests
> > ...snip...
> >
> > The ability to do unit testing should not be an afterthought. It should be
> > considered as a major influence on the architecture of a project.
> >
> > If one cannot do proper unit testing, the architecture of the project is
> > questionable.
>
> ok, so let's use a specific example:
>
> imagine you're building a library, which fetches webpages.
>
> you have a library which can fetch 1 webpage at a time, but it is a
> synchronous library (like wget). you call him, and he returns the page.
>
> but you want an async one.
>
> so you decide to build a threadpool, where every thread will do this:
> look into a queue, and if there is a new URL to fetch, fetches it with
> his wget-like library, and saves the html page somewhere (and maybe
> signals something).
>
> and now the user who uses your library, simply adds the URL to fetch,
> and can check later asynchronously whether they are already fetched or
> not.
>
> could you tell me what unit tests would you create for this example?
>
>
> (a more generic request: is there on the internet a webpage with
> something like this? one where they have some complex
> modules/programs/algorithms, and they show how to write unittests for
> them?)
>
> thanks,
> gabor
>
>
>
OK, I think I understand what the job is, so here is a try.

I'm assuming that this async wget's job is to start at a url, fetch it, track
down and fetch any links and such, get them, and make all of that available on
the local system for later viewing.

To make it testable, I'd design so that the application part of the system
(described above) has as limited a knowledge of its surroundings as possible --
except for the actual work performed. It should have no knowledge of a gui, for
instance.

Instead it should know about an object which represents a 'job'. This object
should have attributes and/or functions which can be accessed to find out the
base URL, the current status or state of the specific job (not started, in
progress (various states here), ..., complete). There should be a log associated
with the job object where both normal and abnormal stuff can be kept. It should
also be able to provide information about the user if there is one, instructions
about the base URL, where in the local file system to store the results, etc.
During the development phase this job object is going to be a bit dynamic as new
needs for it are discovered.
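
A rough Python sketch of such a job object might look like the following. The
names (Job, JobState, advance, dest_dir), the extra FAILED state and the table
of allowed transitions are all assumptions made up for illustration; a real
design would grow its own set of states and attributes.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class JobState(Enum):
    NOT_STARTED = "not started"
    IN_PROGRESS = "in progress"
    COMPLETE = "complete"
    FAILED = "failed"  # assumed error state, not named in the description


# Which state changes are legal (an assumption for this sketch).
ALLOWED = {
    JobState.NOT_STARTED: {JobState.IN_PROGRESS},
    JobState.IN_PROGRESS: {JobState.COMPLETE, JobState.FAILED},
    JobState.COMPLETE: set(),
    JobState.FAILED: set(),
}


@dataclass
class Job:
    base_url: str                 # where fetching starts
    dest_dir: str                 # where results go on the local file system
    user: Optional[str] = None    # the requesting user, if there is one
    state: JobState = JobState.NOT_STARTED
    log: List[str] = field(default_factory=list)

    def advance(self, new_state):
        """Move to new_state, logging both normal and abnormal events."""
        if new_state not in ALLOWED[self.state]:
            self.log.append(
                "rejected: %s -> %s" % (self.state.value, new_state.value))
            raise ValueError("illegal transition")
        self.log.append("%s -> %s" % (self.state.value, new_state.value))
        self.state = new_state
```

Nothing here knows about threads, sockets or a gui, which is what makes it easy
to drive from a test.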

There should probably be one object which can keep track of all of the job
objects and is responsible for creating new ones and deleting old ones.
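
As a minimal sketch of that management object, assuming a hypothetical
JobManager class and a stripped-down stand-in Job that carries only what the
manager needs (none of these names come from any real library):

```python
import itertools


class Job:
    """Stand-in job record; a real one would carry state, log, etc."""

    def __init__(self, job_id, base_url):
        self.job_id = job_id
        self.base_url = base_url


class JobManager:
    """Creates, tracks and deletes jobs. Knows nothing about fetching."""

    def __init__(self):
        self._jobs = {}
        self._ids = itertools.count(1)  # simple increasing job ids

    def create(self, base_url):
        job = Job(next(self._ids), base_url)
        self._jobs[job.job_id] = job
        return job

    def get(self, job_id):
        return self._jobs[job_id]  # raises KeyError for unknown ids

    def delete(self, job_id):
        del self._jobs[job_id]

    def all_jobs(self):
        return list(self._jobs.values())
```

A gui or a test driver would both go through create/get/delete/all_jobs and
nothing else.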

All of the interfaces to the job management object and the job object need to be
formalized and properly documented. This whole subsystem can be tested, then, by
a test driver requesting services via the documented interfaces, changing the
state of a job via the documented interfaces and determining that the state
transitions are as expected. There is no need to fetch any real URLs to do this,
just pretend you did. This test driver also needs to exercise the interfaces
intended for use by a gui.
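
The "just pretend you did" part could be sketched as below, using the standard
unittest module. The Job class, the run_job function, the string state names
and the two fake fetchers are hypothetical; the point is only that the fetch
routine is passed in, so the test driver can substitute a pretend one.

```python
import unittest


class Job:
    """Stand-in job object with just enough state to test transitions."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.state = "not started"
        self.log = []
        self.result = None


def run_job(job, fetch):
    """Drive one job through its states using the supplied fetch callable."""
    job.state = "in progress"
    try:
        job.result = fetch(job.base_url)
    except IOError as exc:
        job.log.append("fetch failed: %s" % exc)
        job.state = "failed"  # assumed error state
    else:
        job.log.append("fetched %d bytes" % len(job.result))
        job.state = "complete"


def fake_fetch_ok(url):
    # Pretend the page was fetched; no network involved.
    return b"<html>pretend page</html>"


def fake_fetch_broken(url):
    # Pretend the link was bad.
    raise IOError("pretend 404 for " + url)


class StateTransitionTest(unittest.TestCase):
    def test_new_job_is_not_started(self):
        self.assertEqual(Job("http://example.invalid/").state, "not started")

    def test_successful_fetch_completes_job(self):
        job = Job("http://example.invalid/")
        run_job(job, fake_fetch_ok)
        self.assertEqual(job.state, "complete")
        self.assertEqual(job.result, b"<html>pretend page</html>")

    def test_failed_fetch_is_logged(self):
        job = Job("http://example.invalid/broken")
        run_job(job, fake_fetch_broken)
        self.assertEqual(job.state, "failed")
        self.assertIn("pretend 404", job.log[-1])
```

Saved in a file, these tests can be run with "python -m unittest" and never
touch the network.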

Now, as to testing the actual application code -- I'd think that you'd need a set
of URLs which would return known and stable results and a number of error
situations (bad links and such) to test against. Then a test driver would be
written to use the standard interfaces to the job management object and the job
object to schedule work against those URLs, determine when that work is done and
test that the results are as expected, highlight the differences between a prior
run against the particular URL and the current run, etc.
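
One way to get URLs with known and stable results is to serve them yourself
from a small local fixture server. The sketch below uses only the Python
standard library (http.server, urllib.request, threading). The PAGES table,
FixtureHandler, start_fixture_server and the fetch function are assumptions
for illustration; fetch stands in for whatever the real application code does.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Known, stable content; any path not listed here acts as a bad link.
PAGES = {"/index.html": b"<html><a href='/missing.html'>bad link</a></html>"}


class FixtureHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_fixture_server():
    """Serve PAGES on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), FixtureHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# An opener that ignores any proxy settings, so requests stay local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url):
    """Return (status, body); a stand-in for the real fetching code."""
    try:
        with _opener.open(url, timeout=5) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, b""
```

A test driver would start the server, schedule work against
"http://127.0.0.1:<port>/index.html" and "/missing.html", compare what comes
back with PAGES, and call server.shutdown() when done.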

I've been retired for years but that was pretty much how we did it. There were
two small programming teams -- one writing application code against the formal
interface documentation and one writing test scaffolding against the same
documentation and building test cases. Things worked, the bug rate was very low,
implementation changes were localized and testable...

Anyway, it worked for us and we never had to claim that we just couldn't test
something except in production.

Bill


