Re: POD speed
From: Lee Fesperman (firstsql_at_ix.netcom.com)
Date: 06/23/04
- Next message: Fabian Rossbacher: "Access to OLEDB 4.0"
- Previous message: Roedy Green: "Re: Foxpro database connection"
- In reply to: Chris Smith: "Re: POD speed"
- Next in thread: Chris Smith: "Re: POD speed"
- Reply: Chris Smith: "Re: POD speed"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Wed, 23 Jun 2004 08:04:57 GMT
Chris Smith wrote:
>
> Lee Fesperman wrote:
> > You could claim (without evidence) that a majority would code it your
> > way, or even a vast majority, but that would still leave 'some'
> > minority which will have mistakes. You can't guarantee that a proper
> > implementation will be used. OTOH, SQL's declarative syntax is
> > guaranteed not to have the problems I listed.
>
> Well, "guaranteed" is an interesting word. I think you mean guaranteed
> if someone does it right and the constraints can be expressed
> declaratively, but then you're in the same boat.
It's most certainly not the same boat. I listed reasons that enforcement of database
constraints in application code is error-prone compared to enforcement by the DBMS using
declarative constaints.
You've asserted that *all* application code in this area uses proper encapsulation and
is 100% error-free. I've merely asserted that this is surely true of the RDBMS.
> .... I wonder how many data
> integrity checks can be done in Java, but can't be done in a database
> declarative constraint. For example, let's say that an integer value is
> supposed to be guaranteed to have neutral disparity -- which means an
> equal number of 1 and 0 bits in the binary representation. Can you
> write a CHECK constraint to ensure that this remains the case? (I don't
> know that you can't, but I'd be interested to find out.)
The constraints you've been describing are 'domain' constraints. This is the one place
in RM where procedural (even OO) code can be appropriate -- constructing user domains.
Domain constraints should be implemented by the domain itself.
Other types of constraints -- row constraints, multi-table constraints are harder to
implement outside the DBMS.
> I frequently see people asking about implementing certain constraints on
> data, and being told to use a trigger to write pseudo-procedural code,
> because a declarative constraint can't do what they want. Of course,
> many more people in similar situations will just leave out the
> constraint and forget about it altogether.
The need for triggers is symptomatic of the weakness of SQL, which is rather poor in
following relational principles. Procedural constructs like triggers, stored procedures
and subqueries are needed to shore up SQL. However, they do have the advantage of being
executed by and on the DBMS (encapsulation) and are generally short, self-contained,
functional-style snippets of procedural code (easier to get right.)
I don't doubt that many needed constraints are poorly enforced or simply ignored.
Unfortunately, the effects of mangled databases are much worse than that of buggy
applications. One of the worst offenders is the popular technique of selective
denormalization to achieve performance. When you denormalize, you need to add
constraints to compensate for the resultant weakening of the database structure. Since
the additional constraints will often wipe out the performance gains, they are simply
ignored.
> It would be interesting, if only it were possible, to find out how often
> constraints on data are typically enforced in databases versus
> application code. I don't know and neither do you, of course; but I'm
> pretty sure that despite your "guarantee", the answer would not be 100%.
> You're willing to reject encapsulation in an OO model because people
> might not use it, despite its being universally taught in every
> programming class (professional, college, or whatever) I've seen as one
> of the most important aspects of doing OO programming; but you're
> willing to assume that everyone will go out of their way to make sure
> they are checking all the data in their database? This, again, flies in
> the face of my experience.
Here you've moved outside of our subtopic -- from a single application to shared data.
When multiple applications share the db, it becomes even more important that constraints
are enforced by the DBMS.
I think you are wildly optimistic about the use of OO techniques. Just because someone
took a college source or attended your training doesn't mean they fully comprehend
encapsulation and will apply it effectively. In the real world of commercial
programming, things are not so neat. Impossible deadlines, unreasonable bosses who
insist on instant results (have you ever read Dilbert?) and other demands/distractions
work against well-written code.
> > > Sure, that's true. If such a tool is used for data entry or
> > > modification, though, then the data is no longer application-private. I
> > > could imagine using such a tool for testing and still considering the
> > > data app-private, but then I don't care about data integrity, because
> > > I'm just testing. If I get something wrong, a test may fail, and then
> > > I'll find it and fix it.
> >
> > You expound on an uninteresting case. The real issue is when such a tool
> > is used on a production database. Note that the tool might not even exist
> > when the application is written.
>
> Here you assume that the customer needs a full-fledged database, and
> then argue that this lightweight solution is inappropriate because it
> fails when used in that way. Not everyone wants to manage data outside
> of the application that uses it. The desire to do so, I agree, is a
> definite indicator that it's time to put that data into a relational
> database.
Actually, I'm assuming that the tool exists or could exist for something like POD.
> All I'm doing here is clarifying what I mean by application-private
> data. I find that such data exists (and, in fact, is often much more
> common than shared data), and part of its definition is that it does
> *not* need to be managed outside of the application.
Applications can have bugs, incomplete coverage of certain situations or just
'perceived' bugs. These will tempt the end-user to use a general tool to 'fixup' the
database.
> > Your claim is not supportable. OO programming is based entirely on
> > programming artifacts invented for programming convenience and not
> > for modeling the real world.
>
> I see that as a false dichotomy. Modeling real-world concepts (i.e.,
> the concepts that applications primarily deal with) is exactly what
> provides programming convenience. I didn't say OO programming isn't
> intended to be convenient, but rather that it's appropriate *because*
> it's convenient. Or maybe I'm misinterpreting what you mean by
> programming convenience?
OO has no data model. There no rules for data structure --- for which class a given data
field should be placed in.
OO is simply ad-hoc. IOW, its ability to model the real world is entirely dependent on
the skills of the individual programmer.
> > Yes, Date proves that the OO model is ad-hoc thus error-prone.
>
> Which is, of course, overstating the point and not even relevant to this
> piece of the discussion.
Not relevant? I was replying to your statement that Date attempts to prove that OO is
less rigorous than RM. Of course it is obvious that OO is hardly rigorous. However, you
also skipped Date's examination of OO's weakness in modeling the real-world.
> > Persistent data always needs to be consistent across the database.
> > Dynamic data does not always need to be consistent with persistent
> > data or within the application.
>
> Unfortunately, what you mean by "consistent" is left ambiguous here. I
> don't doubt that you know exactly what you mean, but there are at least
> half a dozen things you could mean by that word.
Not ambiguous at all. Consistency is a basic database concept. You don't know that?
> (Incidentally, in case you just don't like PostgreSQL, we also tested
> this with Oracle and duplicated the same performance issues with large
> values in fields, which also matches the expectations shared by plenty
> of others who had tried similar things. Or maybe Oracle isn't an
> industrial strength RDBMS either? What is?)
You were vague about what the 'trouble' was. I assumed you meant that it just didn't
work, rather than the performance issues that you now mention. There are quite a number
of solutions to performance issues. I'll assume that you found none that were feasible.
> > For example, proving procedural code is correct is so hard that
> > it is rarely attempted.
>
> Clearly, proving something to be correct is a different matter than
> avoiding errors in practice. Few things can be proven correct,
> especially when there are human factors involved.
I wasn't talking about 'things' in general or human factors; I was referring to
programming languages. Certain non-procedural languages are amenable to correctness
proofs.
-- Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com) ============================================================== * The Ultimate DBMS is here! * FirstSQL/J Object/Relational DBMS (http://www.firstsql.com)
- Next message: Fabian Rossbacher: "Access to OLEDB 4.0"
- Previous message: Roedy Green: "Re: Foxpro database connection"
- In reply to: Chris Smith: "Re: POD speed"
- Next in thread: Chris Smith: "Re: POD speed"
- Reply: Chris Smith: "Re: POD speed"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|