Re: Updating the SQL key value



Jerry Stuckle wrote:
Toby A Inkster wrote:
Erwin Moller wrote:

Example data in table licences:

GPL GNU General Public Licence http://www.gnu...
FDL GNU Free Documentation Licence http://www.gnu...
CC Creative Commons Licence http://www.cre...

Usage is fairly similar, apart from the fact that now, when you look at
the table 'articles' without doing any joins, you can still infer a bit of
information about which licence each article is under, without having to
inner join onto the licences table.

True, but you also have to look at performance issues. When searching
an index, comparing an int is always faster than comparing a varchar.
And comparing a single column is always faster than comparing multiple
columns. And the index file itself is smaller.

Firstly, I said char, not varchar. char columns are fixed width storage
and consequently a lot faster to search than varchar columns. On a 32-bit
processor, char(4) is particularly fast as it corresponds to the
processor's word length.

Additionally, the PK should not be dependent on data which may change -
i.e. if part of your key was the license (code), what would happen if
they changed the licensing terms?

Firstly, in this example, the licence code is an arbitrary string chosen by
the creator of the data, so it needs to change no more often than an
integer key would.

Secondly, it is perfectly acceptable for a primary key to change. A decent
database can even cascade changes to dependent tables for you.

And before you say that this is a waste of space as char(8) takes up
eight bytes rather than 4 bytes for an integer, you're second-guessing
the database engine there. Database engines really are dead clever.

Not generally, they don't, because of potential problems. For instance,
if the data is stored as a pointer to the license table, the system
has to do an additional file lookup to fetch the license data. And what
happens if the entry from the license table is altered - or worse yet,
deleted? There is no referential integrity built in here. Deleting an
item from the license table would require all other tables which point
to that entry be updated - that is, the varchar data would have to be
reinserted into each row in every table which pointed to the license
table.

Of course there's referential integrity built-in, via foreign key
constraints. If foreign keys are correctly set up, then depending on
how you configured it, one of the following actions would happen when
you tried to delete a licence from the licences table which was being
used in the articles table:

1. the RDBMS would not allow you to delete it;
2. the articles which used that licence would also be deleted; or
3. the articles which used that licence would have their licence
set to the default licence.

It's all about cascading.

And in your case, the data would be stored in 4 bytes ("GPL" + 1 byte
length). So it would take the same 4 bytes - but comparisons would
still be slower.

Greater than / less than comparisons would be slower. But (case sensitive)
equal to / not equal to comparisons should be of equivalent speed, as it's
a simple matter of choosing if one 32-bit value is the same as another
32-bit value -- exactly the same process used if they were integers.
Probably the same CPU instructions used.

That's true. But you're also taking up more storage space and slowing
down searches.

More storage space on, say, the articles table, for storing the user's
login in a column instead of an integer; but less space in the user table,
as there's no need to store an extra integer column.

Slower to perform a join between the article and user tables, but fewer
occasions when you'll need to perform a join, as you have some useful user
information (the login) in the article table already.


On one of my current projects, I've got twelve tables, only one of
which has a surrogate integer primary key (technically it doesn't
autonumber, but I use MAX() to simulate autonumbering when creating a
new record). Guess which table is causing me the most problems?

I can imagine. Using MAX() like this can cause concurrency problems.
That's why RDB designers came out with the auto-numbering columns.

Not heard of transactions?

Better would have been to design my table so that it had a primary key
like (url, revision).

But then your comment table wouldn't point to it anyway if you use (url,
revision). Or, if you just use url, your comment table would be
pointing at two different entries - which is not good foreign key
design.

It would be pointing to a partial key. Which I think is against 4NF or
5NF, but I mostly ignore those two as I've found them of very little
practical use.

Anyhow, it's mostly a matter of taste. I was being somewhat tongue in
cheek when describing surrogate keys as "the root of all evil", but I
can't stand to see a good candidate key go to waste.

Here, I agree. But I tend to lean more towards the performance side.
Design the database with efficiency in mind. That's why all databases
aren't 5NF form.

I tend to err on the side of my own sanity. As far as performance is
concerned, I'm of the opinion that it's premature optimisation, and that
the RDBMS programmers will have already done a much better job than I
would have done.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux

* = I'm getting there!
.



Relevant Pages

  • Re: Updating the SQL key value
    ... FDL GNU Free Documentation Licence http://www.gnu ... ... comparing an int is always faster than comparing a varchar. ... and good database design would argue against it. ... if the data is stored as a pointer to the license table, ...
    (comp.lang.php)
  • Re: GPL and issues with PHP applications
    ... INAL, but many articles suggest ... If you use a package that is license only as GPL you ... Nobody can force you to adopt a specific licence for code you have written, ...
    (comp.lang.php)
  • Re: SITESWAP DVD - IS IT READY???
    ... The licence is sold from a UK location and therefore is covered by UK ... The license is a contract which states under UK law that you are ... your property) to steal their wallet, but that would be ripping them off. ...
    (rec.juggling)
  • Re: SITESWAP DVD - IS IT READY???
    ... The licence is sold from a UK location and therefore is covered by UK ... The license is a contract which states under UK law that you are ... If you do this without his consent then you are "taking someone else's ...
    (rec.juggling)
  • Re: Latest on CB licenses?
    ... This is the Amateur licence rules, which do not apply to CB. ... Ofcom handle the licenses for both hobbies. ... And if there's no license then there are no rules. ... Many radio based services and equipment types are licence exempt, ...
    (uk.rec.radio.cb)