Re: SQL
- From: "topmind" <topmind@xxxxxxxxxxxxxxxx>
- Date: 4 Feb 2006 15:28:10 -0800
Dmitry A. Kazakov wrote:
On 3 Feb 2006 18:28:11 -0800, topmind wrote:
Dmitry A. Kazakov wrote:
On 2 Feb 2006 17:13:49 -0800, topmind wrote:
The "expression engine" can be logically separated from a relational
engine it appears to me. As long as the expression engine supports
equality comparisons (less than, equal, etc.),
This is a definition of a class: the set of types, each of them has "<",
"=". Further one might wish "<" to be order relation and "=" be equality
relation. You can argue that all objects should be of this type, but this
is mathematically unsound. For example, tables themselves, as objects,
aren't ordered. Complex numbers aren't ordered, etc. There also might be
incomparable objects.
Well, like I already said, I am not sure relational requires ordering
either. If ordering is a pivotal issue for your claim, then let's
explore it further.
You still need:
1. "=", an equality relation (transitive, symmetric, reflexive)
2. Copy constructor, to have an ability to place things into cells
The above is the interface of a copyable, comparable type.
Having the ability to be copyable and comparable does not *make* it a
"type", unless you are using a very loose definition of "type". It
indeed may be possible to view everything as a type. But that does not
necessarily make everything actually *be* types.
then relational can
operate on it. (And possibly only just equal or not equal are needed.
Need more pondering on this.)
You need order for sorting, otherwise everything will become O(n).
Sorting is a (nice) convenience of most RDBMS, but it's not required for
relational itself. (I had to face this issue when I designed the draft
for SMEQL, my pet query language. I basically made it an "internal"
operation.)
Call it internal, that changes nothing. There must be an order to sort. The
order has to be compatible with the equality relation above. But again all
this is not essential.
Perhaps. Sorting is a nice feature when we *do* have ordering to a
given notation/math/domain. However, it does not have to be part of
relational to be provided as a service.
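To make the equality-versus-ordering point concrete, here is a minimal Python sketch (an illustration, not code from the thread; the function names are hypothetical). A lookup that relies only on "=" has to scan every cell, while a lookup over sorted cells can use binary search, but only if the values also support "<".

```python
from bisect import bisect_left

def find_by_equality(cells, target):
    """Linear scan: needs only '==' on the cell values, O(n) comparisons."""
    for index, value in enumerate(cells):
        if value == target:
            return index
    return -1

def find_by_order(sorted_cells, target):
    """Binary search: needs '<' consistent with '==', O(log n) comparisons."""
    index = bisect_left(sorted_cells, target)
    if index < len(sorted_cells) and sorted_cells[index] == target:
        return index
    return -1

unordered = [3 + 4j, 1 - 1j, 2j]   # complex numbers: '==' works, '<' does not
ordered = [2, 3, 5, 7, 11, 13]     # integers: both '==' and '<' work
```

Calling find_by_order on the complex list raises TypeError in Python, because complex numbers define no "<". The equality-only scan still works on them.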
Important is only existence of an interface for a
set of types. So all objects you place in a table must be at least from
this class. And the point is that there are other interfaces, no less
useful, beyond copyable, ordered types.
Again, having a feature and being a thing (such as a class) are
different leaps.
Thus, if somebody wanted to include imaginary numbers in a relational
query language, in theory they could without busting relational. It
demands very little of the underlying expression model.
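A small Python sketch of that claim, under the assumption that a join needs nothing from the cell values beyond "==" (the function equi_join and the sample rows are made up for illustration):

```python
def equi_join(left, right, key):
    """Nested-loop join: pairs rows whose `key` values compare equal."""
    joined = []
    for l_row in left:
        for r_row in right:
            if l_row[key] == r_row[key]:   # the only operation asked of the value
                joined.append({**l_row, **r_row})
    return joined

# Hypothetical tables keyed on a complex-valued column "z".
signals = [{"z": 1 + 2j, "name": "a"}, {"z": 3 - 1j, "name": "b"}]
gains = [{"z": 1 + 2j, "gain": 0.5}, {"z": 0j, "gain": 1.0}]

result = equi_join(signals, gains, "z")
```

The join code never asks whether "z" is a number, a string, or anything else. Swapping in another value domain would not change it, as long as equality is defined.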
Thank you for explaining why ADT is so great! (:-)) You are right, with ADT
one just needs to implement the interface of a
thing-that-can-be-in-a-relational-table and here it is, the engine works!
I agree that ADT's are pretty good for systems software (although sys
soft. is not my specialty, I should note). However, the concepts don't
appear to apply too well to business modeling. An RDBMS engine that may
be designed such that the "expression engine" (for lack of a better
name) can be swapped without changing the relational engine may indeed
use many ADT concepts.
Same for business, same for any other modeling. Don't some biz-models have
something in common, more advanced/specific than just relations? If so,
then those can be extracted in a form of a "biz engine", more specialized
than general "expression engine". ADTs just give you a tool for doing such
things.
As I have said many times, in the real world the patterns of
differences/changes are not hierarchical. The variations of things tend
to be a semi-random selection of all possible combinations of factors
(Cartesian join). I find sets better able to handle such differences
than hierarchies. "Types" are too tied to the idea of "is-a", when mass
"has-a" management is more appropriate for variation management.
How else can you extend the DB to deal with something new without
changing the engine?
You probably meant not new types, but new classes, new interfaces
additional to a thing-that-can-be-in-a-relational-table. But I don't see
why this should be any advantage.
Note that even for things you can put in
a table, it makes a lot of sense to refine interfaces to have:
integer-number-in-a-table with specific operations to sum a column, and
string-in-a-table, and so on.
I am not following this. One does not need to distinguish between
numbers and strings (internally) to have a language or expression. A
fair amount of dynamic languages are like this. The language does not
care about the difference between a number and a string, treating
everything (or at least every scalar) as pretty much a string. An "add"
function might care, but that is a content validation issue specific to
*it*, not the base language. It may give an "operand not a valid
number" error if it gets a funny string. The base language has no
concept of types for numbers versus strings, only the library
operations may care.
For "add" to report "not a number" means that it should know the types of
the operands. This means that you need types at least to check them.
How specifically is "types" different from "validation" in this example?
x = add("123","99.28");
y = add("foo","7");
The language does not care what "foo" is here. Only the "add" function
will care when it checks to see that the first parameter is a valid
number. There are other things it might check such as range because it
may not be able to add large numbers. Types cannot do this very well
unless we either pick arbitrary chunk sizes, or create a type for every
possible length/size, which is dumb.
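One way the "validation inside the function" reading could look, as a hedged Python sketch. The range limit LIMIT and the error wording are assumptions for illustration. Every scalar travels as a string, and only add inspects the content:

```python
from decimal import Decimal, InvalidOperation

LIMIT = Decimal("1e15")  # hypothetical range limit, chosen arbitrarily

def add(left, right):
    """Add two scalars passed as strings; all checking happens in here."""
    operands = []
    for text in (left, right):
        try:
            value = Decimal(text)
        except InvalidOperation:
            raise ValueError(f"operand not a valid number: {text!r}") from None
        if not value.is_finite():
            raise ValueError(f"operand not a valid number: {text!r}")
        if abs(value) > LIMIT:
            raise ValueError(f"operand out of range: {text!r}")
        operands.append(value)
    return str(operands[0] + operands[1])   # the result is a string again

x = add("123", "99.28")
```

With this sketch, add("foo", "7") raises ValueError at call time. Whether that run-time check counts as "types" or as "validation" is exactly the question under dispute.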
I
leave the issues of safety aside. But even if you enjoy customer calls late
night, nevertheless, the design above is flawed. There are not only the
types of the operands. What about the type of the result? (example: time -
time yields duration, not time!) What about summation of rows? What about
adding distances to luminous flux? What about different kinds of "add"?
(example: scalar multiplication vs. vector one, or, better, I'm sure, you
are aware that floating-point arithmetic is unacceptable for money
calculations, and fixed-point arithmetic for numeric analysis and
scientific calculations.) Can I add a column to column? Can I add tables?
Does this "add" have an inverse operation? When? Any "add" has inverse? Any
type (sic!) has 0, -1? What is -1 of a column? [There is no universal
arithmetic for all objects of any type (sic!) => there cannot be any
universal engine to implement it.]
All these issues are handled by the type system, and there is absolutely no
contradiction to relational model.
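Two of these points can be illustrated with Python's standard library. This is only a sketch: the timestamps are arbitrary, and Decimal stands in for the fixed-point arithmetic mentioned above. The result type of an operation can differ from the operand types, and different domains need different arithmetic:

```python
from datetime import datetime, timedelta
from decimal import Decimal

start = datetime(2006, 2, 3, 18, 28, 11)
end = datetime(2006, 2, 4, 15, 28, 10)

elapsed = end - start        # time - time yields a duration (timedelta)
shifted = start + elapsed    # time + duration yields a time again

try:
    start + end              # time + time has no meaning and is rejected
    time_plus_time_allowed = True
except TypeError:
    time_plus_time_allowed = False

float_sum = 0.1 + 0.2                            # binary floating point
money_sum = Decimal("0.10") + Decimal("0.20")    # decimal arithmetic
```

Here float_sum is not equal to 0.3, while money_sum equals Decimal("0.30") exactly. That gap is the usual reason binary floating point is avoided for money.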
I agree they are generally orthogonal. The rules of the "expression
engine" don't have to be relational's care.
Again, I would like to see a specific scenario of OO outdoing RDBMS in a
custom biz app setting (outside of machine performance issues for now).
Technological changes don't happen overnight. Then there are serious
issues of foundations and lack of properly typed languages. I can't predict
what a typical biz application would do in 10 years.
I am not here to argue the merits of strong/weak/none typing. (Unless,
perhaps you are one who thinks OO == Types.)
Relational does not dictate how a graph "looks". That is a display
issue. However, I hardly see how OO is an improvement.
Mathematically graph is a relation, but its representation as a table is
not only unnatural for humans,
It depends. With a good table-browser and query system, tables are
pretty good at navigating such info once you get used to it. I agree it
takes practice to work effectively with table browsing and querying,
but the best techniques in the longer run are sometimes not the easiest
to start with.
The problem with "box and arrow" type of graphical displays is that it
has proven difficult to display link info and node (box) details at the
same time. And, if you have more than 3 or so factors it becomes too
difficult to turn into a 2D display. Many things have dozens of factor
spaces associated with them. The human visual system cannot handle more
than 3 dimensions and thus is inappropriate for large factor space.
but also utterly inefficient with respect of
space and time required for many operations.
When I encounter enough of such scenarios, I'll stop using it. So far,
not. Further, relational does not dictate implementation. A relational
engine can probably be constructed for a specific purpose such that we
don't have to learn a new paradigm or interface.
"Unnatural" for *which* humans? I've found over the years that
everybody's head works differently.
Well, next time visiting subway, take a look at its map on the wall and ask
yourself if you really would like to have it depicted as an incidence
matrix of stations...
If I am *searching* based on multiple factors, sure I'd be happy to use
a (good) table browser. Physical maps show location, but not factors.
Nor does it preclude the use of physical maps when location info is
desired.
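As a rough illustration of searching a graph by factors, here is a Python sketch using the standard sqlite3 module. The table name link, its columns, and the station data are all invented for the example. The graph is stored as a relation of edges, and the query filters on two factors at once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE link (src TEXT, dst TEXT, line TEXT, minutes INTEGER)")
conn.executemany(
    "INSERT INTO link VALUES (?, ?, ?, ?)",
    [
        ("Central", "Museum", "red", 3),
        ("Museum", "Harbor", "red", 5),
        ("Central", "Park", "blue", 4),
        ("Park", "Harbor", "blue", 2),
    ],
)

# Multi-factor search: hops on the blue line that take at most 4 minutes.
rows = conn.execute(
    "SELECT src, dst FROM link WHERE line = ? AND minutes <= ? ORDER BY src",
    ("blue", 4),
).fetchall()
```

Nothing here says how the result should be drawn. The same rows could feed a wall map or a table browser.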
OO model offers a
choice which IMO by no means excludes relational model.
Well, that is part of the problem: Relational introduces discipline,
conventions, and integrity. Navigational approaches perhaps are
(arguably) a super-set, but that is why navigational is so messy: NO
DISCIPLINE. It is roughly analogous to Structured (nested blocks)
programming versus GO TO's, where navigational is the Goto of
structuring. Or as I like to say, "shanty town".
(Nobody can objectively prove goto's worse or better also. Most
arguments come down to human psychology. "Consistency" is perhaps an
external metric, but tough to measure.)
Ah, but type systems are far better with these issues. In SQL any integer
is an integer. The only difference is the semantically irrelevant number of
digits. With ADTs I can have a number of customers and a number of bug reports,
ensuring that no program would ever mix them.
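A minimal Python sketch of that idea. The class names CustomerCount and BugReportCount are hypothetical, and Python enforces this only at run time, whereas a statically typed language would reject the mix at compile time:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CustomerCount:
    value: int

    def __add__(self, other):
        # Only another CustomerCount may be added.
        if not isinstance(other, CustomerCount):
            return NotImplemented
        return CustomerCount(self.value + other.value)

@dataclass(frozen=True)
class BugReportCount:
    value: int

    def __add__(self, other):
        # Only another BugReportCount may be added.
        if not isinstance(other, BugReportCount):
            return NotImplemented
        return BugReportCount(self.value + other.value)

customers = CustomerCount(40) + CustomerCount(2)
```

Both wrappers hold a plain integer, yet CustomerCount(1) + BugReportCount(1) raises TypeError, so the two kinds of count cannot be mixed by accident.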
Again, I don't want to get into a dynamic-vs-static typing debate.
There are plenty of them already at c2 and usenet. It appears to be a
personal preference or domain-specific decision. I prefer dynamic or
type-free typing systems for the domain I work in. Flexibility and
adaptability is more important than having the compiler check as much
as possible up-front in my observation. Static domains are moving
overseas where cheaper labor makes it more cost effective to spend
effort on the verbosity and nit-pickiness needed by strong
compiler-time checking languages. The comparative advantage of the US
is in trend-hopping, and static types don't work well there.
Further, I find dynamic or type-free systems more adaptable to multiple
languages and tools. Compile-time-checking tends to assume the whole
world is the same language and gag when it isn't. RDBMS info has proven
more sharable than Java, Eiffel, etc.
The problem of gotos is their power. They have too big a "norm".
Mathematically, small program changes may lead to enormous,
unpredictable effects.
Goto fans might argue that the impact of change is only unpredictable
to those who don't "get" goto's. Plus, I have not seen any objective
proof of that, regardless of what my personal preferences are. I have
seen messy "structured" code also. It is hard to provide an objective
test to know if the problem is with the coder or the goto.
That is the fate of programming as a whole.
The relational approach suffers from it very much. Just consider joins. The beauty
of specialized domain languages is not their power, but, on contrary, lack
of power - there is much less you can do wrong.
Huh? If everybody invents their own join, how does that rein in
problems? Much of the stuff that you think is domain-specific is really
just "database verbs" in disguise I bet.
OO as a paradigm and ADT as its vehicle tries to keep the power, but also
provide checks and balances to diminish negative impact of exercising that
power. In particular, discipline in OO is enforced on the component basis.
Show with code.
Outside of specific languages or implementation, the two biggest
differences between relational and OO are:
1. Each "record/map" must belong to one and only one entity (table) in
relational. OO's "map" has no such restriction. Inheritance can emulate
such, but that is optional. Each object can float independently.
Do you refer to singletons here? I don't see why this should be essential,
but it is no problem to enforce that in an OOPL. Make constructors private,
if you want, and here you are. But as a principle, it is wrong - numbers,
strings aren't bound by this rule. 123 can be in any number of cells. Try
to consider a wider picture: there are things in cells, cells themselves,
rows, columns, tables, sets of tables, sets of sets of tables etc. ADT
offers you a unified way to handle that all.
No, it does not. ADT by itself is not a language and relational is a
bigger picture than ADT's.
2. Relational generally assumes a partitioning between data and
behavior, while OO tries to meld them.
Actually it tries to get rid of data. It says that there is no data, but
only behavior. The rationale is as follows. You cannot perceive data, only
the behavior of those. This is in full accordance with mathematics. There
is no such thing as number 123. There is a set of properties it and similar
things expose. Moreover, try to ask yourself what is a relation, and you
will see that a pure relational approach should care about data even less.
Well, data and behavior are just different views of the same thing.
This would get into a definition battle that has no hard math to say
yea or nay.
Relational provides fairly consistent rules and conventions for
organizing boatloads of stuff. ADT's don't. In particular they don't
provide any disciplined or consistent way to link stuff, focusing on
thing-at-a-time instead of the bigger pool of stuff.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
-T-