Re: SQL
- From: "Dmitry A. Kazakov" <mailbox@xxxxxxxxxxxxxxxxx>
- Date: Sun, 5 Feb 2006 15:30:47 +0100
On 4 Feb 2006 15:28:10 -0800, topmind wrote:
Dmitry A. Kazakov wrote:
On 3 Feb 2006 18:28:11 -0800, topmind wrote:
Well, like I already said, I am not sure relational requires ordering
either. If ordering is a pivotal issue for your claim, then let's
explore it further.
You still need:
1. "=", an equality relation (transitive, symmetric, reflexive)
2. Copy constructor, to have an ability to place things into cells
The above is the interface of a copyable, comparable type.
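
To make the two requirements concrete, here is a minimal sketch in Python. The names Money, Cell, holds and select_where are hypothetical, invented only for this illustration; nothing here is taken from any actual RDBMS.

```python
import copy
from dataclasses import dataclass


@dataclass
class Money:
    """Example cell value; @dataclass generates __eq__ for us."""
    amount: int
    currency: str


class Cell:
    """A table cell: it only copies its value in and compares it."""

    def __init__(self, value):
        # Requirement 2: the value must be copyable into the cell.
        self._value = copy.copy(value)

    def holds(self, other) -> bool:
        # Requirement 1: the value must support "=".
        return self._value == other


def select_where(rows, column, wanted):
    """Return the rows whose cell in `column` equals `wanted`."""
    return [row for row in rows if row[column].holds(wanted)]


price = Money(5, "EUR")
rows = [{"price": Cell(price)}, {"price": Cell(Money(7, "EUR"))}]
price.amount = 9  # the cell keeps its own copy, so it is unaffected
matches = select_where(rows, "price", Money(5, "EUR"))
```

Cell never asks what a Money is. It relies only on copy and equality, which is the whole interface in question.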
Having the ability to be copyable and comparable does not *make* it a
"type", unless you are using a very loose definition of "type". It
indeed may be possible to view everything as a type. But that does not
necessarily make everything actually *be* types.
It is a language issue then. The point is that the relational model you
refer to can be fully described in terms of types, and therefore represents
not an alternative, but just a special case.
Call it internal, that changes nothing. There must be an order to sort. The
order has to be compatible with the equality relation above. But again, all
this is not essential.
Perhaps. Sorting is a nice feature when we *do* have ordering to a
given notation/math/domain. However, it does not have to be part of
relational to be provided as a service.
Theoretically yes, but impracticable. This is a serious problem for
container library design. [In fact an RDB is simply a specialized container]
Nobody wants O(n) containers, but for many types one might wish to have
unordered sets, even if there is no reasonable order. One tries to invent
order (making checksum, or comparing addresses), but an invented order
leads to nasty surprises. I have had a pair of horror stories.
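
A small Python sketch of that trade-off, under stated assumptions: Color, try_sort and sort_by_address are hypothetical names, and id() is used as a stand-in for "comparing addresses" (in CPython it happens to be the object's address).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Color:
    """Has equality and a hash, but no meaningful order."""
    red: int
    green: int
    blue: int


# A hash-based set needs only equality and a hash, no order.
palette = {Color(255, 0, 0), Color(0, 0, 255), Color(255, 0, 0)}


def try_sort(items):
    """Return a sorted list, or None when the type defines no order."""
    try:
        return sorted(items)
    except TypeError:
        return None


def sort_by_address(items):
    """An invented order: it sorts, but the result depends on where the
    objects happen to live, not on their values."""
    return sorted(items, key=id)
```

sort_by_address always returns something, which is exactly the trap: two equal colors created at different times can land in different positions, and the order is not reproducible between runs.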
What matters is only the existence of an interface for a
set of types. So all objects you place in a table must be at least from
this class. And the point is that there are other interfaces, no less
useful, beyond copyable, ordered types.
Again, having a feature and being a thing (such as a class) are
different leaps.
It is a philosophical question. In my subjective idealistic philosophy it
is exactly the same.
I agree that ADT's are pretty good for systems software (although sys
soft. is not my specialty, I should note). However, the concepts don't
appear to apply too well to business modeling. An RDBMS engine that may
be designed such that the "expression engine" (for lack of a better
name) can be swapped without changing the relational engine may indeed
use many ADT concepts.
Same for business, same for any other modeling. Don't some biz-models have
something in common, more advanced/specific than just relations? If so,
then those can be extracted in the form of a "biz engine", more specialized
than a general "expression engine". ADTs just give you a tool for doing such
things.
As I have said many times, in the real world the patterns of
differences/changes are not hierarchical. The variations of things tend
to be a semi-random selection of all possible combinations of factors
(Cartesian join). I find sets better able to handle such differences
than hierarchies. "Types" are too tied to the idea of "is-a", when mass
"has-a" management is more appropriate for variation management.
I have answered this. If you can map relations to individual objects rather
than types, do it. Nobody proposes to invent types where unnecessary. A
program with a smaller number of types is easier to understand. The problem
is that quite often this is technically impossible. If you wish to force
everything into the limited set of types SQL has, you must also accept
much higher development costs and maintenance beyond anyone's capacity.
For "add" to report "not a number" means that it should know the types of
the operands. This means that you need types at least to check them.
How specifically is "types" different from "validation" in this example?
x = add("123","99.28");
y = add("foo","7");
The language does not care what "foo" is here. Only the "add" function
will care when it checks to see that the first parameter is a valid
number.
The above isn't properly typed. "123" has the type String. Strings aren't
additive.
There are other things it might check such as range because it
may not be able to add large numbers. Types cannot do this very well
unless we either pick arbitrary chunk sizes, or create a type for every
possible length/size, which is dumb.
Types do it perfectly. I can have a wide set of numeric types having
different models. It is a very important issue. Numeric types are models
and there could be many different models of integer, real and other numbers
as found in mathematics. Note that range is only one aspect here. There is
also precision, rounding, accuracy of numeric operations etc. Further, typed
systems describe requirements on the type, and the compiler/engine is
responsible for fulfilling these requirements *automatically*. This way is
far safer than adding "123" to "99.28" and wondering whether the result is
"12399.28".
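
As a rough illustration, Python's standard decimal module lets one state such a model (precision, rounding) once and have every operation honor it. The two models MONEY and COARSE and the add wrapper are assumptions made up for this sketch, and Python checks the operand types at run time rather than at compile time.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_EVEN

# On strings, "+" means concatenation, not arithmetic.
concatenated = "123" + "99.28"  # "12399.28"

# Two hypothetical numeric models with different precision and rounding.
MONEY = Context(prec=12, rounding=ROUND_HALF_EVEN)
COARSE = Context(prec=3, rounding=ROUND_DOWN)


def add(model: Context, a: Decimal, b: Decimal) -> Decimal:
    """Add two numbers under the rules of the given numeric model."""
    if not isinstance(a, Decimal) or not isinstance(b, Decimal):
        raise TypeError("add expects Decimal operands, not strings")
    return model.add(a, b)


exact = add(MONEY, Decimal("123"), Decimal("99.28"))    # Decimal('222.28')
rounded = add(COARSE, Decimal("123"), Decimal("99.28"))  # Decimal('222')
```

The same two operands give different results under different models, and the caller never writes the rounding logic by hand. Passing the strings "123" and "99.28" is rejected instead of being guessed at.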
Again, I would like to see a specific scenario of OO outdoing RDBS in a
custom biz app setting (outside of machine performance issues for now).
Technological changes don't happen overnight. Then there are serious
issues of foundations and a lack of properly typed languages. I can't predict
what a typical biz application will do in 10 years.
I am not here to argue the merits of strong/weak/none typing. (Unless,
perhaps you are one who thinks OO == Types.)
OO <= Types
but also utterly inefficient with respect to
space and time required for many operations.
When I encounter enough of such scenarios,
Put an image in a relational table, so that each pixel would be a cell.
Put program code in a relational table and write a compiler in SQL
....
Further, relational does not dictate implementation.
Wrong, it puts definite limits on the implementation.
If I am *searching* based on multiple factors, sure I'd be happy to use
a (good) table browser.
That's the point, you need a different paradigm, because "path" is not a
type in SQL. Because result sets aren't ordered in SQL, etc. Write a GPS
car navigation system representing the results as relations and try to sell
it to *anybody*!
Further, I find dynamic or type-free systems more adaptable to multiple
languages and tools. Compile-time-checking tends to assume the whole
world is the same language and gag when it isn't. RDBMS info has proven
more sharable than Java, Eiffel, etc.
It isn't static vs. dynamic. Time of checking is so far irrelevant. As long
as 23 has only one type, there is nothing to check.
The problem with gotos is their power. Their "norm" is too big:
mathematically speaking, small program changes may lead to enormous,
unpredictable effects.
Goto fans might argue that the impact of change is only unpredictable
to those who don't "get" goto's.
There are not so many goto fans left. Same with RDBMS, they are already
extinct, and coming generations will forget about them.
That is the fate of programming as a whole.
The relational approach suffers from it very much. Just consider joins. The
beauty of specialized domain languages is not their power, but, on the
contrary, their lack of power - there is much less you can do wrong.
Huh? If everybody invents their own join, how does that rein in
problems?
Should any problem be solved in terms of joins? You are trying to sell me a
wrong tool!
OO as a paradigm and ADT as its vehicle try to keep the power, but also
provide checks and balances to diminish the negative impact of exercising
that power. In particular, discipline in OO is enforced on a per-component
basis.
Show with code.
But you said that you aren't interested in static program correctness
analysis.
If you want a challenge, fine. Take any machine learning method. Training
sets are ideal tables, rows and columns, nothing else. Take any method of
your choice and implement it in SQL! For an introduction to existing
methods, see the excellent tutorials by Andrew Moore:
http://www.autonlab.org/tutorials/
You have *pure* relational data. Note that the whole of machine learning is
nothing but just SELECT training_set WHEN example=x. So, don't hesitate,
give me a Support Vector Machine in SQL!
Outside of specific languages or implementation, the two biggest
differences between relational and OO are:
1. Each "record/map" must belong to one and only one entity (table) in
relational. OO's "map" has no such restriction. Inheritance can emulate
such, but that is optional. Each object can float independently.
Do you refer to singletons here? I don't see why this should be essential,
but it is no problem to enforce that in an OOPL. Make constructors private,
if you want, and there you are. But as a principle, it is wrong - numbers
and strings aren't bound by this rule. 123 can be in any number of cells.
Try to consider a wider picture: there are things in cells, cells
themselves, rows, columns, tables, sets of tables, sets of sets of tables
etc. ADT offers you a unified way to handle all of that.
No, it does not. ADT by itself is not a language and relational is a
bigger picture than ADT's.
Come on! SQL can be trivially described in ADT's. Try the opposite. Just
place a table into another table, flip a table column, describe a ring in
SQL, write a task scheduler in SQL...
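
To illustrate only the first of these, here is a minimal sketch of a table as an ADT whose cells accept any copyable, comparable value, including another table. Table, insert and select are hypothetical names for this example, and for brevity the rows are kept in a list rather than a true set.

```python
import copy


class Table:
    """Hypothetical table ADT: named columns, rows of arbitrary values."""

    def __init__(self, columns):
        self.columns = tuple(columns)
        self.rows = []

    def insert(self, *values):
        if len(values) != len(self.columns):
            raise ValueError("wrong number of values for this table")
        # Cells hold copies, so later changes by the caller do not leak in.
        self.rows.append(tuple(copy.deepcopy(v) for v in values))

    def select(self, column, wanted):
        """Return all rows whose cell in `column` equals `wanted`."""
        index = self.columns.index(column)
        return [row for row in self.rows if row[index] == wanted]

    def __eq__(self, other):
        # Tables are comparable, so a table may itself be a cell value.
        return (isinstance(other, Table)
                and self.columns == other.columns
                and self.rows == other.rows)


lines = Table(["item", "qty"])
lines.insert("bolt", 4)

orders = Table(["id", "lines"])
orders.insert(1, lines)  # a table placed into a cell of another table
found = orders.select("lines", lines)
```

Nothing in Table had to change to allow nesting: it follows from the cell interface being "copyable and comparable" and from Table satisfying that interface itself.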
2. Relational generally assumes a partitioning between data and
behavior, while OO tries to meld them.
Actually it tries to get rid of data. It says that there is no data, but
only behavior. The rationale is as follows. You cannot perceive data, only
its behavior. This is in full accordance with mathematics. There
is no such thing as the number 123. There is a set of properties it and
similar things expose. Moreover, try to ask yourself what a relation is, and
you will see that a pure relational approach should care about data even
less.
Well, data and behavior are just different views of the same thing.
Can you name this thing? Again, what is a relation? Formulate it, and point
me to the word "data".
This would get into a definition battle that has no hard math to say
yea or nay.
One thing is quite clear: it is impossible even to define the term "data"
without a description of behavior. Look at mathematical definitions of
numbers.
"Data" is a set of computation states, characterized by definite behavior.
When it is said that a table cell contains 1, it means all states where Get
(or SELECT) would deliver a result associated with the application domain
object denoted as 1.
It is the behavioral approach which makes both OO and your beloved
relational model *implementation-independent*. Otherwise, you would be
unable even to talk about "data", because, again, what do two states
of magnetic fields on the hard drives of two computers have in common? They
expose the same *behavior*, which you call "data"! Are the big-endian and
little-endian encodings of 26732147 the same "data"?
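
The endianness question can be checked directly. In this small Python sketch the function name get is hypothetical, chosen to echo the Get above; the 4-byte width is an assumption.

```python
VALUE = 26732147

# Two different stored states for the same number (4-byte encodings).
big = VALUE.to_bytes(4, byteorder="big")
little = VALUE.to_bytes(4, byteorder="little")


def get(raw: bytes, byteorder: str) -> int:
    """Read a stored state back through its matching decoder."""
    return int.from_bytes(raw, byteorder=byteorder)


same_bytes = big == little                               # False
same_behavior = get(big, "big") == get(little, "little")  # True
```

The byte strings differ (one is the reverse of the other), yet both states answer get with 26732147. What they share is the behavior, not the representation.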
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de