Re: Programmer's unpaid overtime.
From: Edward G. Nilges (spinoza1111_at_yahoo.com)
Date: 12/06/03
- Next message: JP: "VS + Cygwin"
- Previous message: parv: "Re: And your favourite data structure is...."
- In reply to: Calum: "Re: Programmer's unpaid overtime."
- Next in thread: Richard Heathfield: "Re: Programmer's unpaid overtime."
- Reply: Richard Heathfield: "Re: Programmer's unpaid overtime."
- Reply: Calum: "Re: Programmer's unpaid overtime."
- Reply: Programmer Dude: "Re: Programmer's unpaid overtime."
Date: 5 Dec 2003 22:41:27 -0800
Calum <calum.bulk@ntlworld.com> wrote in message news:<bqq4lm$onq$1@news7.svr.pol.co.uk>...
> Edward G. Nilges wrote:
> > Programmer Dude <Chris@Sonnack.com> wrote in message news:<3FCFABA6.FA73EF3E@Sonnack.com>...
> >
> > What we need to see are not these single benchmarks but instead the
> > execution time "curves" for the CJ and the EGN versions. There is a
> > possibility that the version I wrote quickly and posted had a flaw,
> > although my production version fails to exhibit NP complete behavior
> > for large files.
>
> It seems only fair that you post an equation for the execution time of
> your program to show us all how it's done.
If you cache the parse of individual words, the time to find a random
word is D*S+(N-1)*K, where D is the number of DISTINCT words, S is the
initial search time for any word using brute force, N is the average
number of times a typical word occurs, and K is a constant that
represents the roughly constant time it takes to look up a keyed item
in a hash table such as a Visual Basic collection.
If you use a brute force search for each word and each of its
occurrences, the formula becomes S*T, where T is the total number of
words, both distinct and repeated.
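The two cost models can be tried out with a rough sketch. This is Python rather than VB, and it only counts abstract cost units: S=10 and K=1 are made-up values chosen for illustration, and S is treated as fixed even though a real brute-force scan would cost more the further right the word sits. Under those assumptions the cached tally charges S once for each distinct word and K for every repeat, while the brute-force tally charges S for every word.

```python
def brute_force_cost(words, S=10):
    """Every occurrence of every word pays the full search cost S."""
    return S * len(words)


def cached_cost(words, S=10, K=1):
    """The first sighting of a word pays S; each repeat pays only the lookup cost K."""
    seen = set()
    cost = 0
    for w in words:
        if w in seen:
            cost += K      # keyed lookup in the cache
        else:
            seen.add(w)
            cost += S      # initial brute-force search
    return cost


text = "the cat saw the dog and the dog saw the cat".split()
print("brute force:", brute_force_cost(text))
print("cached:     ", cached_cost(text))
```

The gap between the two totals widens as words repeat more often, which is the storage-for-time trade described here.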
The paradox is that the latter formula S*T is "simpler" than the other
formula but represents more of what the computer science professor
means by complexity.
The cache solution's execution time formula is a function of the total
number of distinct words (necessarily smaller than the total number of
words) and the average number of times a random word repeats; the brute force
solution's execution time formula is a function of the word count.
The operator of lowest precedence, the "major" operator, of the cache
time formula D*S+(N-1)*K is addition. The operator of lowest
precedence of the brute force formula S*T is multiplication. Unless
something's hidden in the addition-based formula, this means that
total execution time grows "more" slowly for the cache solution, at
the expense of the use of more storage.
Chris Sonnack's numbers show what may be but probably isn't a problem
in this reasoning, but my main concern in the real world is not some
absolute efficiency of a specific program, but the efficiency "trace",
the efficiency "footprint" of a family of related solutions, which is
always the real entity in the system life cycle as the user's needs
change.
I was concerned that a brute force word parser, that for any arbitrary
word passed over the words to its left, would seem to work in testing
but fail when presented with real texts. That's because, unlike some
academics, I've seen what it means when the user presents an actual
large file, which actually slows "working" solutions to a crawl.
I was not completely happy with the usual solution to this problem
which is to transform all the words once and for all into an array,
because in real applications the words "on the left" are usually more
important. A financial news text coming over a wire and being parsed
for words of interest to the user such as "money", "stocks", "equity",
and "prison time", might not always be completely read, and the
appearance of words to the left is more important.
In fact, if the strings are unmanageably large in the real world, the
array will probably be too.
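One way to picture the alternative to the up-front array is a parser that hands out words from the left, one at a time, and stops as soon as the caller loses interest. The sketch below is in Python and is only an illustration under assumptions: the names words_lazily and first_hit are hypothetical, words are taken to be whitespace-delimited, and punctuation is not stripped.

```python
import re
from typing import Iterable, Iterator, Optional, Set


def words_lazily(chunks: Iterable[str]) -> Iterator[str]:
    """Yield words left to right without ever building the full word array."""
    tail = ""
    for chunk in chunks:
        tail += chunk
        # The last piece may be an incomplete word; hold it back for the next chunk.
        *complete, tail = re.split(r"\s+", tail)
        for w in complete:
            if w:
                yield w
    if tail:
        yield tail


def first_hit(chunks: Iterable[str], wanted: Set[str]) -> Optional[str]:
    """Return the first word of interest, reading no more input than necessary."""
    for w in words_lazily(chunks):
        if w.lower() in wanted:
            return w
    return None


feed = ["markets open. mon", "ey moves fast ", "and equity follows"]
print(first_hit(feed, {"money", "equity"}))
```

Because the generator pulls chunks only on demand, text to the right of the first hit is never read, let alone parsed.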
I have an essentially verbal, and some may say verbose, understanding
of the problem. I am certain that Richard Heathfield et al. will find
it distasteful.
This is because without knowing it, they are engaged in a postmodern
attack on language itself, far more serious than the attack
neoconservatives think is being made.
In this attack, comp.programming is not about programmers (har!) or if
it is they are to be subject to a new law of omerta, and must be a
bruder schweigen.
The game becomes cutting pure code with an incommunicable
understanding of the problem which in my book isn't an understanding
at all.
The campaign is part of what I consider the deprivation of language by
politics in which narration and naming become a management prerogative
over which clowns fight in this ng.
As in the cases where Richard Heathfield has informed me that I can no
longer observe (for example) that a community "has an
understanding", the game becomes to strive for a language deprived of
the ability to talk but at the dullest and most literal level.
Paradoxically and because we still must deal with abstractions, the
dull muttering that results is riven with self-contradiction and is in
fact a language of DENIAL.
This is for example on view in a recent CSpan televised hearing of a
meeting on military "transformation" in the United States.
The truth in the USA is that a gang of thugs and criminals is in
charge but one can't say this (any more than one can say that a large
program, in which the company has invested millions and on which yer
job depends, won't work).
The result is a curious language in which certain keywords are
uttered, not to push meaning forward but instead to give signals to
Authority that one is O.K.
One can never, in this type of environment, even name or speak of the
reverse of the favored term. For example, the Rumsfeld doctrine is
that the USA military should be "fast". Fast is good: slow is bad, and
no sleep until Baghdad or until the program works.
In this sort of environment, one can't sit back and make a case for a
slow offensive even though the Russians won a war by this type of
attack.
We are familiar with this bureaucratic reality in MIS. For example, if
a manager at a conference has heard about stateless objects, you don't
want to call your objects with state non-stateless even if that's an
important piece of documentation.
You don't want to call your .Net objects thread-unsafe even though
that's important information.
You can't, in other words, document, and management wonders why
"there's no documentation."
The problem is that the military brass and DP management are
subconsciously convinced that the era of mere human difference is
over. Because language is a trace of differences, this results in a
positive hatred of language.
I am reminded in this connection of the sort of nervous breakdown the
journal Computerworld had in the 1990s. Because it was about data
processing, its content in the 1970s would address the material
situation of people with midlevel jobs in MIS. It became a sort of
soapbox of discontent.
But, this may have upset powerful advertisers and the result is today
that its articles resemble, in their failure to acknowledge the truth
(that MIS people as working people will have interests in common with
upper management and shareholders, and interests that diverge),
articles in Pravda or Signal, the Nazi picture magazine.
Recent content includes an article claiming that the old IBM idea of
"best effort", in which IBM customer engineers were recognized and
compensated even if the customer's problem wasn't solvable, should be
retired in favor of something else, such as shooting the failures, I
guess. Another article recommended that MIS managers go to porn sites
to see how to create attention-getting Web pages...that don't ever go
away.
I think it would be far healthier for Computerworld to simply
acknowledge that real MIS employees (who often work without health
benefits or decent pay relative to their contribution) need to have
their own voice, perhaps in the form of a page for the help to at
least vent their concerns, whether about impossible deadlines, unpaid
overtime or offshore. As it is, the denial generates nonsense such as
I've identified, and the fundamental problem is one we see in this ng.
People can't speak honestly anymore.
This is the real terrorism, in my book.
>
> > The fact remains that you seem to confuse genuine analysis of software
> > complexity with a few benchmarks.
>
> Complexity? Tell me more! :-)
There is, as I hope you know, a whole bunch of useful academic work on
software complexity which is the mathematical manipulation and
analysis of the execution time formulae of algorithms and actual
programs.
Chris Sonnack presented some interesting numbers. But I have no way of
auditing those numbers. He may have made some stupid error.
In the example, I was trying in fact to show how bad C sucks. This is
because it completely obscures the difference between code with state
and code without state.
If you create an array or cache in the C parser, you have to figure
out where to put it. If you put it in "open code" outside of any
function, where this area previously contained only #defines and structures
that did not allocate structure instances, you have a New Thing which
needs RAM and which within this RAM develops a state.
This Thing becomes responsible for managing different strings from
different callers. Over and above the individual cache for a string it
is responsible for storing and looking up different input strings.
Whereas there's a one-to-one relationship, tightly coupled, between a
PowerString instance and its string.
Idiots, who talk about complexity metrics and mean lines of code or If
nesting, need to understand mathematics including its limitations.
Because a piece of code is complex or simple only in relation to a
reader, you CANNOT measure software complexity, except as social
research of the crudest sort which affects the phenomenon it would
describe.
"My code is simple: your code is [way too] complex" is narcissism pure
and simple. And, while it doesn't exactly break my heart, companies
have lost big bucks because schools do not teach programmers that 90%
of their real job will be (1) understanding and then (2) changing
somebunny else's code. The result is that the newbie (1) rewrites the
code.