RE: Updating counts
- From: Philip.Garrett@xxxxxxxxxxx (Philip Garrett)
- Date: Fri, 30 Mar 2007 09:44:50 -0400
Hi George,
George Bills wrote:
Sorry if I'm asking a silly question - I'm very inexperienced with
databases. I'm playing around with using DBI / SQLite as a back-end
for parsing and storing log files - I have multiple columns like "log
date", "logged by", etc. I parse these in a Perl script, and then
store them in the SQLite database. I would like to increment a "count"
column if I parse the same log line twice: on the first time I see
that log line, I parse it and store it in its own row, and on
subsequent occasions I increment a "count" column to say that I have
seen that log line "count" amount of times. Some quick Googling has
shown some stuff to do with auto increments, but that seemed to be
much more to do with primary keys and wasn't guaranteed to be in order
/ contiguous. I've also seen MySQL syntax like "ON DUPLICATE KEY
UPDATE ..." which looks good, but I'd like to avoid an extra
dependency on a database that has to be set up and maintained (hence
SQLite).
My preference for a solution would be something that's simple and
maintainable primarily, followed by fast (for reads first, writes
second) - but I don't have the database experience to know what the
best solution is. Is there an easy way?
Yes, but it looks like you've already done it: try fetch, update if
successful, insert otherwise.
Should I be storing duplicate rows and counting them at display time?
Probably not, but it depends on how many rows there are. Grouping lots
of rows to provide a count at display time would be slow. Might as well
get that step out of the way when you load the data.
$dbh->selectall_arrayref("SELECT * FROM $table WHERE line =
?", undef, $_);
Since you don't actually need the data from that row, you might speed
this up a little by changing "SELECT *" to "SELECT 1". Some DBMSes
will optimize that to only check an index instead of looking in the
table itself.
Now for some "generic" optimization hints:
1) Since you'll be calling each of these statements multiple times, you
should prepare() the statements outside the loop, and just execute()
them inside.
2) Turn off AutoCommit. Then, commit() infrequently. In most cases, this
will speed up data loading considerably.
Regards,
Philip
.
- Follow-Ups:
- RE: Updating counts
- From: Ronald Kimball
- RE: Updating counts
- Prev by Date: Re: ANNOUNCE: Apache-Status-DBI-v1.0.0.tar.gz
- Next by Date: Re: DBI compile fails in Solaris 10
- Previous by thread: How do you install oracle 10g instant client on linux/RH ??
- Next by thread: RE: Updating counts
- Index(es):
Relevant Pages
|
|