Re: teaching a child - console or GUI
From: J French (erewhon_at_nowhere.com)
Date: 07/29/04
- Next message: J French: "Re: teaching a child - console or GUI"
- Previous message: Gary: "Modal form behaviour"
- In reply to: Marco van de Voort: "Re: teaching a child - console or GUI"
- Next in thread: Marco van de Voort: "Re: teaching a child - console or GUI"
- Reply: Marco van de Voort: "Re: teaching a child - console or GUI"
Date: Thu, 29 Jul 2004 08:17:55 +0000 (UTC)
On Wed, 28 Jul 2004 15:23:12 +0000 (UTC), Marco van de Voort
<marcov@stack.nl> wrote:
>On 2004-07-28, J French <erewhon@nowhere.com> wrote:
>
>>>>
>>>> What bit of it does not perform
>>>
>>>Unfinished sentence on my part, I think. I think I meant that the
>>>application doesn't perform int 21h's, or any other legacy technique.
>>
>> I understood you meant 'perform' in the sense of work well enough
>> - eg: too slow
>Definitely not :-)
Uh ...
>>>> - or rather what is it that makes it slow if you keep most of the data
>>>> on disk ?
>>>
>>>Filters/queries over 6 million objects (in 8 entities/tables) must be in an
>>>acceptable time. Multiple users might run queries at the same time, but not
>>>many (say 1-4 users)
>>
>> I can see that, however one can make a system 'learn'
>> eg: a simple query can store its results in a BitMap that it slaps to
>> disk under a file name that is ... well the query
>
>Sure. But the queries here are not likely to reoccur before the next update.
Right
>> It does not need to be bitmaps, sparse results (eg: all 2003 records)
>> can be just a list of 4 byte pointers
>
>This could be done for us too. Keep in mem even. However my own situation
>doesn't benefit from this.
Not even if you can 'add' known data sets together
eg: All 2004 transactions of Corporate clients sorted by Alpha
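
A minimal sketch of 'adding' stored result sets together, assuming each earlier query left behind a bitmap with one bit per record. The function names, record numbers and the two example sets are made up for illustration; nothing here is taken from either poster's system.

```python
# Hypothetical illustration: one bit per record, set when the record matches.

def make_bitmap(record_numbers, total_records):
    """Return a bytearray with one bit set per matching record number."""
    bits = bytearray((total_records + 7) // 8)
    for n in record_numbers:
        bits[n >> 3] |= 1 << (n & 7)
    return bits

def and_bitmaps(a, b):
    """Intersect two result sets of equal length."""
    return bytearray(x & y for x, y in zip(a, b))

def bitmap_to_records(bits):
    """Expand a bitmap back into an ascending list of record numbers."""
    return [i * 8 + j
            for i, byte in enumerate(bits)
            for j in range(8)
            if (byte >> j) & 1]

TOTAL = 20                                            # made-up table size
year_2004 = make_bitmap([1, 3, 4, 8, 15, 19], TOTAL)  # stored result of one query
corporate = make_bitmap([0, 3, 8, 9, 19], TOTAL)      # stored result of another
both = and_bitmaps(year_2004, corporate)

print(bitmap_to_records(both))   # [3, 8, 19]
```

Combining two stored sets costs one pass over the bitmaps and never touches the records themselves.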
>> A place I once worked used to thrive on selling databases to financial
>> institutions (they still do) and we developed a whole load of ways of
>> accelerating searching and sorting
>
>Why bother? Taking an RDBMS should make things easier, not more difficult.
I am not proposing an RDBMS
- I'm not convinced they make things faster
- mostly they just save time through working on the server rather than
passing gigs of raw data through the network
>>>Also note that 6 million objects are a lot more lines in a RDBMS, because you
>>>have all kind of coupling tables for 1 to many relations.
>>
>> Are these objects very complex, or are they really a bunch of pointers
>
>pointers, strings, dates,
Right ...
>>>The use of indexes is limited, or you really need a lot of indexes.
>>
>> At its simplest an 'index' is only a list of sorted 4 byte pointers
>> One can hold a lot of those on disk ...
>
>And loading them in a system under load (with constant disk io) is worse
>than the _real_ querytime in our system.
surprisingly little, because one is doing very few large disk reads
rather than thousands of small disk reads
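
A rough sketch of an index held as a flat file of 4 byte record numbers and pulled back with one large read. The file name and the sample records are assumptions made for the example.

```python
import os
import struct
import tempfile

records = ["pear", "apple", "fig", "cherry"]   # made-up data; list position = record number

# Build the index: record numbers ordered by the field value.
order = sorted(range(len(records)), key=lambda i: records[i])

path = os.path.join(tempfile.mkdtemp(), "by_name.idx")   # hypothetical file name
with open(path, "wb") as f:
    # "<I" is a little-endian unsigned 32-bit integer: exactly 4 bytes per entry.
    f.write(struct.pack(f"<{len(order)}I", *order))

# Load the whole index back with a single read.
with open(path, "rb") as f:
    raw = f.read()
loaded = list(struct.unpack(f"<{len(raw) // 4}I", raw))

print(loaded)                          # [1, 3, 2, 0]
print([records[i] for i in loaded])    # ['apple', 'cherry', 'fig', 'pear']
```

At 4 bytes per entry, an index over 6 million records would be roughly 24 MB per sort order.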
>Keep in mind that we keep the total mem<->disk bandwidth free. (except
>for some really minor logging/spooling). In a RDBMS this bandwidth is already
>under stress.
Yes - almost a diskless system
>Implementing tricks to make a RDBMS compete with an in-mem solution is not
>smart, since similar tricks benefit the in-mem solution too (and usually
>more)
Sure they do - however since memory is finite ....
>>>No, 5 years max. And even that is unlikely. They probably could do with 2,3
>>>years real data, and global stats from the other 2,3 years. But it is not
>>>worthwhile to code that.
>>
>> They could just pull up the 5 years minus system from a DVD
>> I also 'purge' my files on my main system, in the knowledge that the
>> clients have numerous archive data sets
>
>Yes, of course the data is retained, but it is no longer online. Moreover,
>the main app is single .exe, and the CPU power needed is +/- 2GHz. Memory
>req is now 800 MB, but that grows 300MB/year.
Sounds pretty much like home user kit !
>Having a temp machine for some special purpose (e.g. analysis by a trainee)
>only requires adding some memory to some old server or workstation and copying
>data + server .exe on it.
>>>However that 300MB/year figure and the five year figure is the
>>>current situation.
>>
>> Any chance of it going wild ?
>
>Not in the coming 1/2 years. The system is built to scale to 64-bit. Even
>if it contained all of Holland, 16GB would do the trick.
Right....
>Moreover the data is quite partitionable, so a cluster solution is also
>possible (though not preferred)
You mean cluster of PCs - yes I also wondered about that
Personally, if going down that route, I would have one for preparing
an ordered list of selected records, and another for pumping the data
back to the client
>>>We didn't know the exact sizes and date ranges yet when we made the
>>>decision. At the time we were afraid of hitting the Delphi limit of 2-3 GB.
>>>It was more 780MB/year in the initial version, and application memory
>>>on top of that. Improved indexing, and packing some data decreased the
>>>size of the data.
>>
>> eg: trashing white space and tokenizing longer fields I guess
>
>Pretty much half of it yes, whitespace, tokenizing, string2datetime etc.
Right, I wonder whether you have looked into replacing the string
system - I should imagine the data is pretty repetitive
>Rethinking the container system was the other half.
>> It sounds as if you have some raw files that you crunch and slap into
>> RAM - rather like building a CD 'database'
>We start from .DBFs of the old system. What is a CD database according to
>you?
To me a CD database is a collection of R/O files that have been
heavily pre-processed so that one has numerous sort orders stored as
lists of pointers on disk, extract files of frequent search fields in
a normalized format .... basically any trick to make searching and
sorting a matter of adding/removing/merging pre-formed sets of data
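
One way to picture this, as a sketch only: a sort order prepared in advance is filtered by a merged set of record numbers, so the answer comes out already sorted. All names and numbers below are invented for the example.

```python
# Prepared once, when the read-only files are built (made-up record numbers).
alpha_order = [7, 2, 9, 0, 5, 3, 8, 1, 6, 4]   # every record, in name order
year_2004 = {0, 2, 3, 5, 8, 9}                 # pre-formed set: 2004 transactions
corporate = {1, 2, 5, 6, 9}                    # pre-formed set: corporate clients

# Merge the pre-formed sets, then walk the stored sort order.
wanted = year_2004 & corporate
result = [r for r in alpha_order if r in wanted]

print(result)   # [2, 9, 5] -- already in name order, no sort at query time
```

Adding a set would use `|` and removing one would use `-` in the same way; the stored sort order is reused unchanged.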
>>>> Yes, well as we get older, we get craftier
>>>> Maybe it is time to look at it again
>>>
>>>I had to be convinced too. (by my colleague). But now I have done a few projects with the mentality,
>>>I wonder why I never saw it myself.
>>
>> You mean that you had to be convinced of the 'CD in RAM' approach ?
>> It kind of makes sense, but it /is/ possible to emulate that with very
>> little speed hit by making the App think that it is just using RAM,
>> while it is really using a large 'window' onto a file
>
>You mean memory mapping ?
By you - not by the machine
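
Reading "by you" as a window the application manages itself, rather than one the operating system maps, a minimal sketch might look like the following. The class name, window size and test file are assumptions for illustration; the OS-managed alternative would be a memory-mapped file.

```python
import os
import tempfile

class FileWindow:
    """Serve reads from a buffered slice of a file, reloading when a read falls outside it."""

    def __init__(self, path, window_size):
        self.f = open(path, "rb")
        self.window_size = window_size
        self.start = 0       # file offset of the first buffered byte
        self.buf = b""       # the current window
        self.loads = 0       # number of real disk reads, for illustration

    def read(self, pos, n):
        # Assumes n <= window_size.
        inside = self.start <= pos and pos + n <= self.start + len(self.buf)
        if not inside:
            self.f.seek(pos)
            self.buf = self.f.read(self.window_size)
            self.start = pos
            self.loads += 1
        off = pos - self.start
        return self.buf[off:off + n]

    def close(self):
        self.f.close()

path = os.path.join(tempfile.mkdtemp(), "data.bin")   # hypothetical data file
with open(path, "wb") as f:
    f.write(bytes(range(256)) * 4)                    # 1024 bytes of made-up data

w = FileWindow(path, window_size=128)
a = w.read(10, 4)     # first read loads the window starting at offset 10
b = w.read(100, 4)    # falls inside the window: no disk access
c = w.read(500, 4)    # falls outside: the window is reloaded
w.close()

print(list(a), list(b), list(c), w.loads)
```

The calling code only ever asks for "bytes at position N", so it cannot tell whether the data came from the buffer or from disk.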
>> After all, your current system must be paging memory in and out, even
>> if you have a heavily chipped up 'data server'
>
>Nope. Mem costs Eur 250/GB. We simply bought 2 GB. That's less than a
>programmer costs a week.
I sincerely hope so
>>>We are still thinking in a DOS way about memory I think. Memory as precious
>>>resource. It is a commodity now.
>>
>> You mean you are, or I am, or both of us ?
>
>Programmers in general.
I'm not so sure from looking at the horrors some coders come up with,
but even so, I do prefer to be mean with memory.
>> I agree it is a commodity, but if anything, that is the problem
>> It stops people looking at the underlying data structure
>
>The underlying datastructure is what is in memory. Trying to stuff it
>in a RDBMS, and then making it more complex is what is unnatural.
I really was not advocating a conventional RDBMS
>> I can, for example, see that if your data is what I think it is, then
>> you could have multiple copies of the relevant stuff sorted in
>> different order.
>
>Sure. But there are a lot of crazy optimisations that one could do. The
>fundamental question remains. Why would I ?
It could improve performance several hundred fold
Just using a BChop on a sorted list is many times faster than
sequentially scanning a list
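
For illustration, a binary chop over a sorted list using Python's standard `bisect` module. The helper name `bchop` and the data are made up for the example.

```python
from bisect import bisect_left

def bchop(sorted_keys, key):
    """Return the position of key in sorted_keys, or -1 if it is absent."""
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i
    return -1

keys = list(range(0, 2_000_000, 2))   # 1,000,000 sorted even numbers (made-up data)

print(bchop(keys, 1_234_568))   # 617284
print(bchop(keys, 1_234_567))   # -1 (odd numbers are not in the list)
```

A chop over a million entries needs about 20 comparisons, where a sequential scan needs half a million on average.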
>I only have to make sure that mutations are journaled and flushed. I don't
>need complex transactional support.
Yes, I had figured that
>> I really would not use a 3rd party RDBMS for anything (unless
>> seriously well paid) however building ones own filing system is not
>> particularly hard, and is rather interesting
>>
>> Your system has rather caught my interest, probably because it sounds
>> similar to problems I've worked on in the past.
>>
>> I really do believe that the key to speed is algorithms, not RAM
>
>That's a common mistake. It is an equation, and algorithms is a variable
>in that. Language, compiler speed, hardware are all variables too.
True - but the wrong algorithm can have a dramatic effect
>If your algorithms totally suck, it is the limiting factor sure. But it
>makes no sense to build an own custom system, while one week of wages can
>pay for the hardware to run it.
Yes - but once you have the hardware it seems to make sense to get
things going faster
>Of course a fundamental difference is if you sell the same product again and
>again, or, like me, use a single instance only for an inhouse job.
Hmm... perhaps ... not sure
- after all speed /is/ a major factor for you
I was interested in several things you mentioned, the Strings are
rather interesting - I'm assuming AnsiStrings here not effectively
arrays of Chars.
From digesting data in the past I have generally found that 'String
Fields' tend to be very repetitive, and that one is often better off
just having a 4 byte pointer into a 'Lexicon'
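
A small sketch of the 'Lexicon' idea, assuming each distinct string is stored once and every record holds only a 4 byte token. The class, its method names and the sample values are hypothetical.

```python
import struct

class Lexicon:
    """Store each distinct string once and hand out small integer tokens."""

    def __init__(self):
        self.words = []   # token -> string
        self.ids = {}     # string -> token

    def intern(self, s):
        if s not in self.ids:
            self.ids[s] = len(self.words)
            self.words.append(s)
        return self.ids[s]

    def lookup(self, token):
        return self.words[token]

lex = Lexicon()
cities = ["Amsterdam", "Rotterdam", "Amsterdam", "Utrecht", "Amsterdam"]  # made-up field values
tokens = [lex.intern(c) for c in cities]

# Each record now needs only a 4 byte token instead of the full string.
packed = struct.pack(f"<{len(tokens)}I", *tokens)

print(tokens)                            # [0, 1, 0, 2, 0]
print(len(packed))                       # 20
print([lex.lookup(t) for t in tokens])   # the original values, recovered
```

The saving grows with how often values repeat; a field with few distinct values costs 4 bytes per record plus one copy of each string.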
Another thing I think I mentioned earlier is that 'RAM' devices are
getting very large and very cheap, one could literally stick a few
Memory Stick devices into some USB ports and get a vast amount of very
fast 'near RAM'
Sounds an interesting system
- thanks for putting up with my curiosity