Re: Reading million records
- From: "Shloma Baum" <test@xxxxxxxxx>
- Date: Wed, 4 Jul 2007 11:14:49 -0400
Excellent Post... thanks.
"m. Th." <a@xxxxx> wrote in message news:468a95b7@xxxxxxxxxxxxxxxxxxxxxxxxx
Sandeep Chandra wrote:I have a table with 1million+ records. Using FoxPro I am able to read all records in 8 seconds. Using Postgres it takes almost 2 minutes to read all the records.
How can I get same performance as FoxPro using Postgres or any other database?
Forget it. You simply cannot. :-) (...hehehe....)
FoxPro is what is called an ISAM DB - or a double-linked list manager - (in fact ISAM means Indexed-Sequential Access Manager) or in our common language - a "desktop database". These desktop RDBMS, are _way_ faster (several orders of magnitude) than the so called DB (SQL) servers, because of many factors mainly those ones related to multi-user support (concurrency control, transaction stuff aso) and, of course, network things (authentication, comm/protocol, multi-client handling aso.). This applies _especially_ when you have as your ISAM the FoxPro which, imho, is the best RDBMS arround, at least AFAIK (of course we are talking now about 'professional' desktop RDBMSes).
How did you the above benchmarks?
I suppose that for the FoxPro you did a USE... SCAN/ENDSCAN on your table and you measured the time in which this little program completed, isn't? In this case 8 seconds is a (very) natural value. Why? Because your FoxPro .prg, as a DBISAM, looks at his data as it were 'locally' (on your C:\ or on a \\<Server>\<Share> doesn't matter...), hence your SCAN traversed _only_ the record headers jumping from pointer to pointer (remember, we just said that we have here a "*Double-linked-list* manager" - each .DBF is in fact a (big) double-linked list with pointers to the next/prior record. So, by issuing SCAN/ENDSCAN (or other navigational commands for that matter) the engine is moving only the record pointer which is blazingly fast (almost no read on HDD), no matter how "wide" your record is. So, 8 secs is really a natural value, imho.
OTOH, Postgres is a SQL DB server and those beasts works only on the 'network' - it really doesn't matter where the client actually is - especially in the Postgres case. (for the record, Firebird has a local protocol which speeds up the things if the client and the server is on the same machine, but Postgres really uses the same approach for any case). So we have all the 'drawbacks' (note the quotes) stated above...
When you opened your pgAdmin perhaps you did a Select * from TestTable; and after this you went to the end of the table (or doing a 'Fetch All' or something similar).
Now the things are entirely different:
the client must 'talk' with its server on a network protocol and here a table scan _issued_by_the_client_ is very very costly because the server is forced to do a NATURAL scan on table and send all the data from the table (which is located to server) to the client, meaning the _all_ the data is read from server's HDD, even in your case the server and the client are located on the same machine.
So, 2 minutes isn't so odd, (it depends also what PC do you have).
Just to compare:
In your test, 100000 recs in 2 mins means ~ 8333 recs/second.
I did a similar test using Firebird 2.0 and a real-world DB (in fact, the first one which I found in my SQL Manager) and 670000 recs were fetched in 40 secs which means = 16750 recs/second. (the DB is stored remotely on a server over a 1Gb LAN - which can slow a bit the test).
Even if it seems that Firebird is _more_than_twice_as_fast_ than Postgres, is ~ 7.5 times _slower_ than your FoxPro (...hehehe... "Nothing runs like a Fox"...)
So why today (almost) everyone goes to SQL servers?
It's all about multi-user. Concurrency control.
FoxPro (and Delphi+BDE) is way better for one-user _navigational_ DB apps.
When you have sensibly more users scattered on a LAN *and*/or WAN (interenet) and you'll start to do 'Select Sum(*), Avg(*), <other 20 fields/expressions here> from <7 jonins here> where <12 conditions here> group by <5 conditions here> ... then you'll see the difference.
Imho, you must change your way of thinking: from navigational to Client/Server SQL. Imho, here is the problem. Even we don't use Postgres, I can humbly advice you that (at least) in your case it has nothing to do with its problems/quirks aso. Imho, the test which you did isn't real. On client/server you'll update records not by:
with t1 do
begin
DisableControls;
First;
While not Eof do
begin
T1Field1.AsInteger:=T1Field1.AsInteger+10;
Next;
end;
EnableControls;
end; //navigational
but rather with UPDATE T1 SET Field1=Field1+10; //Client/Server
I haven't tuned Postgres yet but was wondering if it would be possible to achieve it with Postgres at all. Would moving the database to a Linux server have any effect?
No tunning necessary here. (of course you can increase the page cache but this isn't relevant in your test). Also, because you filled a new db and you did in fact a
'Select * from t1' (which forces a NATURAL scan) updating the index statistics (with ANALYZE and such), vacuuming and other things to 'improve' the DB performance (in fact updating the Index statistics engine to reflect the real status of your data) it will have _no_ effect, since you don't use indexes at all.
DB -> Linux? Well, perhaps, it's up to you to try (we use Win2003 R2 here and we are very satisfied). But _definitely_ much more influence will have the hardware (storage layer - go for a hardware SCSI or SAS RAID controller/SAN if you can afford, CPU and, of course, RAM - as much as you can, but if you work with Postgres AFAIK they don't have a 64bit version yet, so...). But once again, you'll _never_ reach the FoxPro's performance index if you'll going to measure it in the way in which you measured above.
Regards
Sandeep
(as an aside, the Postgres config is a full with very 'verid' values for today hardware, so if you'll use it be sure to ask in their ngs for support in tunning correctly your server).
HTH,
--
m. th.
.
- Follow-Ups:
- Re: Reading million records
- From: John Herbster
- Re: Reading million records
- References:
- Reading million records
- From: Sandeep Chandra
- Re: Reading million records
- From: m. Th.
- Reading million records
- Prev by Date: Re: Evans data on Windows development...
- Next by Date: Re: Reading million records
- Previous by thread: Re: Reading million records
- Next by thread: Re: Reading million records
- Index(es):
Relevant Pages
|