Re: Just say no to threads [Was: Software architecture]
From: Ronald E Jeffries (ronjeffries_at_acm.org)
Date: 11/01/04
- Next message: David Lightstone: "Re: Just say no to threads [Was: Software architecture]"
- Previous message: CTips: "Re: Just say no to threads [Was: Software architecture]"
- In reply to: Debbie Craft: "Re: Just say no to threads [Was: Software architecture]"
- Next in thread: Phlip: "Re: Just say no to threads [Was: Software architecture]"
- Reply: Phlip: "Re: Just say no to threads [Was: Software architecture]"
- Reply: Daniel Parker: "Re: Just say no to threads [Was: Software architecture]"
Date: Mon, 01 Nov 2004 06:53:47 -0500
On Sun, 31 Oct 2004 20:56:10 -0800, Debbie Craft <d145@yahoo.com>
wrote:
>Perhaps because we are told that someone who had experience at an ebay would
>have a chance to leverage that experience but there is no place in the process
>that actually happens because every decision is local.
Well, Debbie, I've read all your posts carefully before replying to
any. This seems to be near the latest in time, so I'll start here.
If I haven't understood your position, please correct me. What I'm
getting is that your big issue with XP is that you don't see how
experience is leveraged. For example, you're concerned because some
XPers say that they wouldn't start with setting up a database even if
they knew there was going to be a database in the app.
And I'm getting that you don't see how a focus on local decisions
could result in applying one's broader experience.
Again, if these aren't quite right, please correct me, and read what
follows as answering not quite the right questions. Thanks.
First, a short resume. I've been doing software development since
1961, and my teams and I have written language compilers, operating
systems, relational database systems, and a host of applications. I
have Master's degrees in math and computer science. I've been in the
XP movement since its very beginning, and have trained and coached
many teams in how to do it, often with great success. I mention this
all just so you'll know that I'm not some kid, that I have some
experience with what I'm talking about, and that I have at least a
basic education in what software development is all about. I apologize
if all that sounds pompous or something; I just wanted to leave you
with a picture of where I'm coming from.
Now then. Let's imagine that we two database-experienced programmers
are setting out to build some app that we know is going to have a
database in it. To pick an example that I have been chatting with a
fellow about recently, let's say that we have a company that has a
large number of samples of some kind of material. Scientists come to
our company with specs for the materials they want to look at, and we
give them samples of our samples, according to what they want. We
charge them various amounts, depending on what the samples are, how
rare they are, and so on.
Our mission is to compute an inventory value for our samples. We'll
imagine that they aren't, as yet, in a database at all.
Now my colleague just spent a couple/three weeks designing an SQL
database for the samples. Each sample has various fields, describing
it, and there are separate tables containing common information to be
looked up, in that cute way that normalized relational databases have.
Then he got to work on the SQL, and after a while he had something
like seven really hairy SQL statements that computed the value of the
inventory.
When he and I were talking, our theme was how an XP team might do
really short iterations, deliver something that looks like real
business value to the customer, and still use all our knowledge while
evolving to a good final solution. Now I told him, and will tell you,
that in spite of having led the team that released the very first
time-sharing based fully relational database product, and two other
database products, I'm not sure that I would want to wind up with the
calculation being done by SQL statements. The reason is that those
statements, though very powerful, aren't all that easy to change and
optimize, and we know that as new samples are added to this database,
the algorithm for valuing them is going to change. And, frankly, I
just prefer to keep the real logic of the app in real code rather than
in stored procedures and such. Not that I wouldn't wind up with a
database: I'm assuming here that we will. But I'd use the DB more as a
storage and retrieval mechanism and a bit less as a calculation
mechanism, at least until the SQL code had stopped changing.
Now you and I might have a long chat about whether that was the best
way to go, and we'd have to decide. But you aren't here, and here's
what I would do.
I would ask the COO for whom we're doing this job two questions, and
get these two answers:
- How many samples do we have? About 15,000
- How much do you think the inventory is worth? We have it on the
books right now for $3,000,000.
Cool. Then I'd write a program that looks like the one below. I'm just
sketching here in this posting, but I would really write, compile, and
run this code and test:
in TestInventory class:

    public void testFlatInventory() {
        SampleCollection samples = new SampleCollection();
        int value = samples.Value();
        assertEquals(3000000, value);
    }

in SampleCollection class:

    public int Value() {
        int value = 0;
        foreach ( Sample sample in this ) {
            value += sample.Value();
        }
        return value;
    }

in Sample class:

    public int Value() {
        return 200;
    }

(Now, really, I wouldn't go even that far for my first test, because
what I've got here shows SampleCollection as having real collection
capability, an iterator and such. I would probably start with an
ArrayList, either in the test, or inside SampleCollection class. My
purpose would be simple: to get that code actually running and passing
its test as quickly as possible: in ten minutes, if I can.)
You could probably do it faster than I could, but I couldn't get an
SQL system installed on my PC, set up a database of 15000 records,
whip them in and out, evaluate them all, and get it running in ten
minutes. So I'd do the first test something like the above, deferring
the use of SQL.
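
A minimal sketch of that simpler first cut, with the collection backed by a plain list as the parenthetical above suggests. The constructor that takes a sample count and the Count property are assumptions added here so that the test has something to add up; they are not in the original sketch.

```csharp
using System.Collections.Generic;

// First cut: every sample is worth a flat $200.
public class Sample
{
    public int Value()
    {
        return 200;
    }
}

// Wraps a plain List<Sample>. The count-taking constructor is a
// hypothetical convenience so the test can run without real data.
public class SampleCollection
{
    private readonly List<Sample> samples = new List<Sample>();

    public SampleCollection(int count)
    {
        for (int i = 0; i < count; i++)
        {
            samples.Add(new Sample());
        }
    }

    public int Count
    {
        get { return samples.Count; }
    }

    // Process the samples one at a time and add up each sample's value.
    public int Value()
    {
        int value = 0;
        foreach (Sample sample in samples)
        {
            value += sample.Value();
        }
        return value;
    }
}
```

With the COO's figure of about 15,000 samples, new SampleCollection(15000).Value() comes to 15,000 * 200 = 3,000,000, which matches the book value in the test.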
But here's something interesting: this code, as simple as it is,
actually begins to express how the app is going to work: we're
going to process the samples one at a time and add up each sample's
value. This design decision reflects my understanding from the COO
that there are lots of different kinds of samples, each with its own
way of being evaluated.
**** Now, I am mindful that in the final application, it might
make more sense to have a more complex approach, perhaps partitioning
the samples into various types (the ones with a fixed value, the ones
with a value declining by age, the ones with value pegged to the prime
rate, whatever). I can see that if this was the case, I might have
several subcollections of samples and process each subcollection
separately. Who knows, with a decent operating system I might even
process them on separate threads and take advantage of my
multi-processor system with multiple I/O channels. But that's not for
today, and neither is the creation of the SQL. I think about those
things: I don't do them.
My focus is on creating value, on getting feedback from really
creating running code, and on feedback from the COO. If I cared to,
after this ten minute interval, I could go to the COO and say: OK,
here's version one, we treat each sample as worth $200 and add them
up. What shall we do now?
We don't know what he would say. He might focus on the fact that the
XYZ samples are worth way more than all the others, around $2000. He
might focus on the fact that there are actually 15,231 samples today
and that the number changes every day. Depending on what he values, I
would do different things to this program.
But would I put in a database? Not yet. Even if the next step required
me to get some real-world information, such as the real number of
samples, I wouldn't put that into SQL yet. I don't need to, and it's
not getting harder to do, because it'll always be exactly the same
work: define the schema, populate it. Hmm, populate it. How will we do
that?
Turns out they have a flat file of sample information, just some basic
information. Part of why they've called us in is that they know it
isn't enough info to value the inventory, or to do much of anything
else. Right now, when someone wants some samples, they go through the
flat file with EasyTrieve or something, pull out a list of sample
numbers that might be interesting, then go through the card files to
look at those samples, making a list on paper of the samples for this
client. I'm sure they'll want an app to do all that. Interesting.
So we'll grab that file and read it in, creating 15,000-odd instances
of Sample into our SampleCollection. We'll add that code to the
SampleCollection constructor. Our test fails, because 15,231 * 200
isn't 3,000,000. We fix the test, either by doing the math or changing
it to do samples.count*200. It doesn't matter much, but I think I'd do
the multiply, because I expect to be working on this program for a
couple of weeks, and for now at least, I'm running against the live
data.
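
One way the file-reading step might look, as a sketch only. The post does not describe the flat file's layout, so this assumes one sample per non-blank line and keeps each record as an uninterpreted string; the Record property and the TextReader parameter are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using System.IO;

public class Sample
{
    // The raw record is kept whole; which fields matter is not yet known.
    public string Record { get; private set; }

    public Sample(string record)
    {
        Record = record;
    }

    public int Value()
    {
        return 200;
    }
}

public class SampleCollection
{
    private readonly List<Sample> samples = new List<Sample>();

    // Assumption: one sample per non-blank line of the flat file.
    public SampleCollection(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length > 0)
            {
                samples.Add(new Sample(line));
            }
        }
    }

    public int Count
    {
        get { return samples.Count; }
    }

    public int Value()
    {
        int value = 0;
        foreach (Sample sample in samples)
        {
            value += sample.Value();
        }
        return value;
    }
}
```

Taking a TextReader rather than a file name lets the test feed in a few lines through a StringReader and assert that Value() equals Count * 200, while the real run passes in File.OpenText on the live file.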
OK, what now? Maybe I'd identify those $2000 samples and evaluate
them. The test would be to count them and adjust the expected value by
$1800 each, and the code in Sample.Value() turns into something
like:

    public int Value() {
        if (this.IsExpensive)
            return 2000;
        else
            return 200;
    }

Fascinating. We have an increasingly accurate estimation of value and
so far we've only been programming for a few minutes. And our
IsExpensive property will need to look at some of the fields of the
Sample. We're starting to find out which fields of the existing flat
file are useful, what they mean, and to encode that learning in our
own program.
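
A sketch of how IsExpensive and the adjusted test arithmetic could fit together. It assumes, purely for illustration, that each sample carries a kind code and that "XYZ" marks the $2000 samples; the real flat-file field that decides this is not given in the post, and InventoryCheck is a hypothetical helper.

```csharp
using System.Collections.Generic;

public class Sample
{
    private readonly string kind;

    public Sample(string kind)
    {
        this.kind = kind;
    }

    // Assumption: the kind code alone decides whether a sample is expensive.
    public bool IsExpensive
    {
        get { return kind == "XYZ"; }
    }

    public int Value()
    {
        if (IsExpensive)
            return 2000;
        else
            return 200;
    }
}

// Hypothetical helper holding both sides of the test's comparison.
public static class InventoryCheck
{
    // Start from $200 for every sample, then add the $1800 difference
    // for each expensive one.
    public static int ExpectedValue(int totalCount, int expensiveCount)
    {
        return totalCount * 200 + expensiveCount * 1800;
    }

    public static int ActualValue(IEnumerable<Sample> samples)
    {
        int value = 0;
        foreach (Sample sample in samples)
        {
            value += sample.Value();
        }
        return value;
    }
}
```

For five samples of which two are XYZ, both sides come to 4600: 5 * 200 + 2 * 1800 on the expected side, 2 * 2000 + 3 * 200 on the actual side.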
As we continue doing this, we'll probably rather quickly come across
information that isn't in the flat file, but is only on the little
cards defining each sample. If we haven't already discussed this with
the COO, we'll get him started on figuring out how we're going to get
this information typed into a computer somewhere. We could do that in
a lot of different ways:
- Maybe we do a little app in some Form-building language, and
  create a new flat file;
- Maybe we have them type the necessary information into Excel;
I don't know how we'll do it. Maybe we'll even do it in SQL, but I'm
not convinced that it's time for that. We can learn a lot, quickly, by
continuing to evolve this program.
**** Now at first I had a bit of a twinge here. Was I really about
to suggest that we have people go through the 15,000 cards and add
just a couple of fields to some data store, and then maybe do the
whole thing over again to add more fields? Well, honestly, we might.
It really wouldn't take much longer than doing it in one pass over the
cards, since the number of fields is just as important as the number
of cards.
But we might not. We might move at this point to creating some dummy
data, /based/ on the card contents, to work on the Value() method. If
we went that way, we would create a new test collection, perhaps with
just one each of each "kind" of sample. We would be finding out what
the member variables of the sample need to be: the fields of the card
that actually go into the value calculation.
At some early point, I'd expect that the Sample class might split into
various subclasses: the if statement is already pointing that way. I'd
let it split when the code told me that. If there are really only two
kinds of samples, I might put it off. As the kinds grow to three and
four, I'd be more inclined to create an abstract class and concrete
subclasses, or an interface and concrete implementations. But I'd let
reality tell me that.
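
If the kinds did multiply, the split might look roughly like this. The two subclasses echo kinds mentioned earlier in this post (fixed value, value declining with age), but the class names and the linear decline formula are assumptions made for the sketch, not anything the COO has said.

```csharp
using System;
using System.Collections.Generic;

public abstract class Sample
{
    public abstract int Value();
}

// A sample whose value is a fixed price, such as 200 or 2000.
public class FixedValueSample : Sample
{
    private readonly int price;

    public FixedValueSample(int price)
    {
        this.price = price;
    }

    public override int Value()
    {
        return price;
    }
}

// Assumption: value drops by a fixed amount per year and never goes below zero.
public class AgingSample : Sample
{
    private readonly int initialPrice;
    private readonly int lossPerYear;
    private readonly int ageInYears;

    public AgingSample(int initialPrice, int lossPerYear, int ageInYears)
    {
        this.initialPrice = initialPrice;
        this.lossPerYear = lossPerYear;
        this.ageInYears = ageInYears;
    }

    public override int Value()
    {
        return Math.Max(0, initialPrice - lossPerYear * ageInYears);
    }
}

public class SampleCollection
{
    private readonly List<Sample> samples = new List<Sample>();

    public void Add(Sample sample)
    {
        samples.Add(sample);
    }

    // Unchanged by the split: still one pass, adding up each sample's value.
    public int Value()
    {
        int value = 0;
        foreach (Sample sample in samples)
        {
            value += sample.Value();
        }
        return value;
    }
}
```

The summing loop in SampleCollection does not change when the if statement moves into subclasses, which is part of why the split can wait until the code asks for it.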
And so it would go. At some point, we would either decide that the
right thing is to put all this into SQL, or we would not. Since in the
real application there are only 15,000 samples, and since a sample can
be described according to my colleague in about 200 bytes, we might
decide to keep them on some flat file and process them in memory. Or
we might decide, based on frequency of change, that SQL is the right
answer.
Either way, I'd expect the structure of the app to be supportive of
either decision, because we will have determined the important fields
of the records, from the member variables of our classes. We will have
a good idea how to partition the database of raw information, based on
our concrete classes. We'll know what, if any, complex SQL we need,
based on whatever search conditions we've used in our loops in the
program.
OK, long story, but I felt it necessary to get deep enough into the
app that we could see how the thinking works. Note that because I'm
rather familiar with data-based apps, I'm mindful from the beginning
of the kinds of things that will come up. Therefore I never really
move toward truly stupid designs, just moving through designs of
decreasing simplicity and increasing appropriateness. Note also that
because I am by experience reluctant to commit anything to a database
schema until I think it's stable, you and I might have some
discussions, and try some experiments, if you wanted to go to SQL
sooner.
Note also that if there is some newbie in the room who doesn't know
SQL even as well as I do, we can bring them up to speed. And if
there's some database expert on the team who really doesn't know how
to program in C#, we can pair with him on some of this early stuff and
give him some concrete simple experience, while he helps us be aware
of any nuances of the database issues, so that we don't program
ourselves into some stupid corner.
I really do work this way, and the teams I work with learn to work
this way as well. What happens is that they get bits of functionality
delivered much sooner; they get better feedback from their customer;
their customer realizes that they are working hard and working on his
problem, not some techie issues.
And they stop sooner! Maybe right after we do the $2000 samples, the
COO says "Hey, now that we have those ones done, you know we can just
do the time-variable samples and that value should be close enough to
satisfy the auditors!" So we focus on those, do them, and ship the
program.
All the time, we are mindful of all the design issues we are capable
of thinking about. If we're experienced, as you and I are, we think
about lots of issues. We put into the code those matters that actually
/matter/ in our joint, expert opinion.
We could, of course, do a bunch of design up front, as my colleague
did. We would do a better job than he did, because we're smarter than
he is, but we would still inevitably make mistakes and would almost
certainly do too much, especially if, faced with what we've done, the
COO decides to do just the expensive ones and the time-variable ones.
So while I do think up front about what to do, I have found that a
balance where I turn to code to express my design ideas works better
for me. Having helped a lot of people learn how to do this, I find
that they choose a balance of moving to code more quickly as well.
My colleague didn't get back to the boss with running code for two or
three weeks. We can get back in a day. I'd rather do that.
So, there's a story of how it actually works. What are your thoughts
and concerns?
Thanks,
--
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.