Re: teaching a child - console or GUI
From: Marco van de Voort (marcov_at_stack.nl)
Date: 07/28/04
- Next message: Dodgy: "Re: D7 IDE Editor short cuts"
- Previous message: Chris L.: "D7 IDE Editor short cuts"
- In reply to: J French: "Re: teaching a child - console or GUI"
- Next in thread: J French: "Re: teaching a child - console or GUI"
- Reply: J French: "Re: teaching a child - console or GUI"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Wed, 28 Jul 2004 15:23:12 +0000 (UTC)
On 2004-07-28, J French <erewhon@nowhere.com> wrote:
>>>
>>> What bit of it does not perform
>>
>>Unfinished sentence on my part, I think. I think I meant that the
>>application doesn't perform int 21h's, or any other legacy technique.
>
> I understood you meant 'perform' in the sense of work well enough
> - eg: too slow
Definitely not :-)
>>> - or rather what is it that makes it slow if you keep most of the data
>>> on disk ?
>>
>>Filters/queries over 6 million objects (in 8 entities/tables) must complete
>>in an acceptable time. Multiple users might run queries at the same time,
>>but not many (say 1-4 users)
>
> I can see that, however one can make a system 'learn'
> eg: a simple query can store its results in a BitMap that it slaps to
> disk under a file name that is ... well the query
Sure. But the queries here are not likely to reoccur before the next update.
> It does not need to be bitmaps, sparse results (eg: all 2003 records)
> can be just a list of 4 byte pointers
This could be done for us too. Keep in mem even. However my own situation
doesn't benefit from this.
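
To make the suggestion above concrete: a minimal Python sketch of caching a query's result as a bitmap keyed by the query text. All names here (QueryCache, make_bitmap, the dict-based records) are hypothetical, and the cache lives in a dict rather than in one file per query as the quoted text proposes.

```python
import hashlib

def make_bitmap(matches, n_records):
    """Pack matching record numbers into a bitmap, one bit per record."""
    bits = bytearray((n_records + 7) // 8)
    for i in matches:
        bits[i >> 3] |= 1 << (i & 7)
    return bytes(bits)

def bitmap_to_ids(bits):
    """Expand a bitmap back into a sorted list of record numbers."""
    return [i for i in range(len(bits) * 8) if bits[i >> 3] & (1 << (i & 7))]

class QueryCache:
    """Maps query text to a result bitmap (in-memory dict; an assumption)."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def lookup(self, query, records, predicate):
        # The key is derived from the query text, as in the quoted idea.
        key = hashlib.sha1(query.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
        else:
            matches = [i for i, r in enumerate(records) if predicate(r)]
            self._store[key] = make_bitmap(matches, len(records))
        return bitmap_to_ids(self._store[key])

    def invalidate(self):
        """Must be called on every data update, or results go stale."""
        self._store.clear()
```

The invalidate step is the catch raised in the reply: if queries rarely repeat before the next update, the cache is cleared before it ever gets a hit.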
> A place I once worked used to thrive on selling databases to financial
> institutions (they still do) and we developed a whole load of ways of
> accelerating searching and sorting
Why bother? Using an RDBMS should make things easier, not more difficult.
>>Also note that 6 million objects are a lot more rows in an RDBMS, because you
>>have all kinds of coupling tables for one-to-many relations.
>
> Are these objects very complex, or are they really a bunch of pointers
pointers, strings, dates,
>>The use of indexes is limited, or you really need a lot of indexes.
>
> At its simplest an 'index' is only a list of sorted 4 byte pointers
> One can hold a lot of those on disk ...
And loading them in a system under load (with constant disk I/O) is worse
than the _real_ query time in our system.
Keep in mind that we keep the total mem<->disk bandwidth free. (except
for some really minor logging/spooling). In a RDBMS this bandwidth is already
under stress.
Implementing tricks to make an RDBMS compete with an in-mem solution is not
smart, since similar tricks benefit the in-mem solution too (and usually
more).
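
For illustration, the "index as a list of sorted pointers" idea from the quoted text can be sketched in a few lines of Python. This is a sketch under assumptions: the record layout and function names are made up, and array type "I" is typically (not always) 4 bytes per entry.

```python
from array import array
from bisect import bisect_left, bisect_right

def build_index(records, field):
    """Return record numbers sorted by the given field (stable sort)."""
    order = sorted(range(len(records)), key=lambda i: records[i][field])
    return array("I", order)

def lookup_range(index, records, field, low, high):
    """Record numbers whose field lies in [low, high], via binary search."""
    # Keys are materialised here for simplicity; a real index would
    # keep them next to the pointers or compare through the pointer.
    keys = [records[i][field] for i in index]
    lo = bisect_left(keys, low)
    hi = bisect_right(keys, high)
    return list(index[lo:hi])
```

The same structure works unchanged whether the array sits in RAM or is read from disk; the argument in this thread is only about which of the two costs less under load.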
>>No, 5 years max. And even that is unlikely. They probably could do with 2,3
>>years real data, and global stats from the other 2,3 years. But it is not
>>worthwhile to code that.
>
> They could just pull up the 5 years minus system from a DVD
> I also 'purge' my files on my main system, in the knowledge that the
> clients have numerous archive data sets
Yes, of course the data is retained, but it is no longer online. Moreover,
the main app is a single .exe, and the CPU power needed is roughly 2 GHz.
The memory requirement is now 800 MB, but that grows by 300 MB/year.
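
As a back-of-the-envelope check on those figures, here is a small Python sketch. It takes the numbers at face value and ignores application overhead and any purging of old years; the 2048 MB limit is an assumption based on the 2 GB of memory mentioned further down in this post.

```python
def years_until_limit(current_mb, growth_mb_per_year, limit_mb):
    """Years of linear growth left before the data set reaches limit_mb."""
    return (limit_mb - current_mb) / growth_mb_per_year

# 800 MB today, +300 MB/year, 2 GB installed (assumed limit)
headroom = years_until_limit(800, 300, 2048)
print(f"about {headroom:.1f} years of headroom")
```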
Having a temp machine for some special purpose (e.g. analysis by a trainee)
only requires adding some memory to some old server or workstation and copying
data + server .exe on it.
>>However that 300MB/year figure and the five year figure is the
>>current situation.
>
> Any chance of it going wild ?
Not in the coming one or two years. The system is built to scale to 64-bit.
Even if it contained all of Holland, 16GB would do the trick.
Moreover the data is quite partitionable, so a cluster solution is also
possible (though not preferred).
>>We didn't know the exact sizes and date ranges yet when we made the
>>decision. At the time we were afraid of hitting the Delphi limit of 2-3 GB.
>>It was more 780MB/year in the initial version, and application memory
>>on top of that. Improved indexing, and packing some data decreased the
>>size of the data.
>
> eg: trashing white space and tokenizing longer fields I guess
Pretty much half of it yes, whitespace, tokenizing, string2datetime etc.
Rethinking the container system was the other half.
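
A rough Python sketch of the kind of packing meant here: trimming whitespace, replacing repeated strings with small integer tokens, and storing dates as integers instead of text. The class and function names are hypothetical; the only format detail assumed is that DBF date fields are 8-character YYYYMMDD strings.

```python
from datetime import date

class TokenTable:
    """Stores each distinct string once and hands out integer tokens."""

    def __init__(self):
        self._ids = {}
        self.strings = []

    def token(self, text):
        text = text.strip()          # DBF fields are space-padded
        if text not in self._ids:
            self._ids[text] = len(self.strings)
            self.strings.append(text)
        return self._ids[text]

def pack_date(text):
    """Convert 'YYYYMMDD' to a day number (proleptic Gregorian ordinal)."""
    return date(int(text[:4]), int(text[4:6]), int(text[6:8])).toordinal()
```

A record then holds a few integers instead of padded strings, which is where much of the size reduction in a scheme like this would come from.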
> It sounds as if you have some raw files that you crunch and slap into
> RAM - rather like building a CD 'database'
We start from .DBFs of the old system. What is a CD database according to
you?
>>> Yes, well as we get older, we get craftier
>>> Maybe it is time to look at it again
>>
>>I had to be convinced too (by my colleague). But now that I have done a
>>few projects with that mentality, I wonder why I never saw it myself.
>
> You mean that you had to be convinced of the 'CD in RAM' approach ?
> It kind of makes sense, but it /is/ possible to emulate that with very
> little speed hit by making the App think that it is just using RAM,
> while it is really using a large 'window' onto a file
You mean memory mapping ?
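
For readers unfamiliar with the term: memory mapping lets a program address a file as if it were a byte array in RAM, with the OS paging data in on demand. A minimal sketch using Python's standard mmap module follows; the file name and the fixed 8-byte record layout are assumptions for illustration only.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<II")   # hypothetical layout: record id, year

def write_records(path, rows):
    """Write fixed-size records back to back into a binary file."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(RECORD.pack(*row))

def read_record(path, n):
    """Read record n through a read-only memory map of the file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Only the pages actually touched are brought into memory.
            return RECORD.unpack_from(m, n * RECORD.size)

path = os.path.join(tempfile.mkdtemp(), "records.bin")
write_records(path, [(1, 2002), (2, 2003), (3, 2004)])
third = read_record(path, 2)
```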
> After all, your current system must be paging memory in and out, even
> if you have a heavily chipped up 'data server'
Nope. Mem costs Eur 250/GB. We simply bought 2 GB. That's less than a
programmer costs a week.
>>We are still thinking in a DOS way about memory I think. Memory as precious
>>resource. It is a commodity now.
>
> You mean you are, or I am, or both of us ?
Programmers in general.
> I agree it is a commodity, but if anything, that is the problem
> It stops people looking at the underlying data structure
The underlying data structure is what is in memory. Trying to stuff it
into an RDBMS, and then making it more complex, is what is unnatural.
> I can, for example, see that if your data is what I think it is, then
> you could have multiple copies of the relevant stuff sorted in
> different order.
Sure. But there are a lot of crazy optimisations that one could do. The
fundamental question remains. Why would I ?
I only have to make sure that mutations are journaled and flushed. I don't
need complex transactional support.
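
As an illustration of "journaled and flushed", here is a minimal Python sketch of an in-memory store with an append-only journal. It is not the actual system: the class name, the JSON line format and the key/value model are all assumptions, and a torn final line after a crash is not handled.

```python
import json
import os

class JournaledStore:
    """In-memory dict whose mutations are appended to a journal file."""

    def __init__(self, journal_path):
        self.data = {}
        self.journal_path = journal_path
        if os.path.exists(journal_path):
            self._replay()
        self._journal = open(journal_path, "a", encoding="utf-8")

    def _replay(self):
        """Rebuild the in-memory state by re-applying every journal entry."""
        with open(self.journal_path, "r", encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                self.data[entry["key"]] = entry["value"]

    def set(self, key, value):
        # Write and force the entry to disk before changing memory,
        # so an acknowledged mutation survives a crash.
        self._journal.write(json.dumps({"key": key, "value": value}) + "\n")
        self._journal.flush()
        os.fsync(self._journal.fileno())
        self.data[key] = value

    def close(self):
        self._journal.close()
```

On restart the store replays the journal, which is all the durability this approach needs when there is no requirement for multi-statement transactions.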
> I really would not use a 3rd party RDBMS for anything (unless
> seriously well paid) however building ones own filing system is not
> particularly hard, and is rather interesting
>
> Your system has rather caught my interest, probably because it sounds
> similar to problems I've worked on in the past.
>
> I really do believe that the key to speed is algorithms, not RAM
That's a common mistake. It is an equation, and algorithms are one variable
in it. Language, compiler speed and hardware are all variables too.
If your algorithms totally suck, they are the limiting factor, sure. But it
makes no sense to build your own custom system when one week's wages can
pay for the hardware to run it.
Of course a fundamental difference is whether you sell the same product again
and again, or, like me, use a single instance only for an in-house job.