Re: SQL



(part 2 of reply)

> >> often through an object-relational mapping tool (Apple's 'Enterprise
> >> Objects', part of their WebObjects product). This allows me to use the key
> >> strengths of RDMS:es - persistence, organization of data, and through SQL
> >> a relatively standard interface - while also allowing me to use the
> >> strengths of OO, for implementing the business logic for those objects
> >> that are stored in the database.
> >
> >Strength of OO? You are talking to the wrong guy. oop.ismad.com
>
> Then why are you talking in the comp.object newsgroup?

There is no comp.anti-object, so stuff related to OO, including
criticism, goes into the same topic. This is how it works.


>
> >> But make no mistake: The database is a
> >> _useful_ part for storing the data, but it isn't a _necessary_ part. Other
> >> persistence mechanisms could be used.
> >
> >OO'ers tend to use very little of what DB's can offer. They like to
> >reinvent the other items on that list outside of "persistence" just
> >because they can. If you want to have fun, look at how OO apps tend to
> >implement many-to-many relationships.
>
> OO'ers tend to use very little of what databases have to offer, because
> they tend to think that the other things that databases these days offer
> (stored procedures, triggers, extensions in all sorts of directions)
> should be handled in other parts of the system than in the database,
> because OO suggests a certain organization of code - with objects carrying
> data and methods to operate on that data - and from such a standpoint, the
> main use for a database is indeed for persistence, rather than for any of
> the extensions that are cheerfully added onto them to bring them out of
> the 'just storing data' niche.
>
> To-many relationships are generally represented, in OO code, as an object
> holding a collection of the objects it is related to; if the relationship
> is many-to-many, then the related object will in turn have a collection of
> objects that it is related to, and the first object will be in the second
> objec's collection just like the second ons is in the first's.
>
> Yes, this means that in order to make certain that the relationships are
> always many-to-many, i.e., that no object is ever only on one end of such
> a relationship, you need to update two collections (that of the source
> object, and that of the destination object) whenever you create or
> destroy sich a relationship. However, this is just a direct consequence of
> the fact that each object knows about the objects that related to it
> (i.e., for a 'friends'' relationship, each object keeps track of the
> object that it considers to be its friends); and if such a relationship is
> always supposed to be reciprocal (i.e., either both peopel consider each
> other their friends, or neither of them does), then both objects need to
> be told of any changes to this status (so, if one object decides that 'you
> are not my friend anymore', then the other object also must remove the
> first object from its 'friends' list).
>
> Now, in a relational database, a one-to-many relationship is represented
> by a foreign key in the _related_ object, identifying the _source_ object:
> I.e., each row keeps track of which other row (singular) is related to it.
> So, if you were representing a tree structure, each 'child' row would hold
> a foreign key to its 'parent' row; whereas the 'parent' rows would not
> explicitly hold any information about their children. To find out what
> rows are the children of any given parent, a join has to be performed
> (which is cool, because databases are built aroudn that concept); but it
> still leaves the, from some points of view slightly strange, situation
> that in one direction the information is immediately there (each child row
> has a reference to its parent row), whereas in the other direction, the
> information must be deduced (by, essentially, asking all the possible rows
> 'do you consider me to be your parent?').

Those kinds of things can be wrapped in functions or stored procedures
such that one does not have to care about how the back-links work. And,
one does not need to duplicate information (two-way links) to do it. On
a large scale, such duplication can create nastly concurrency and
integrity problems.

>
> And many-to-many relationships are represented _only_ by indirect
> information - a join table; which is a table that holds _only_ foreign
> keys. In this case, neither one of the tables that are related to one
> another, actually hold the information about which other rows thay are
> related to, themselves; only the join table holds this information. each
> row in the join table relates one source and destination row; so in order
> to traverse the relationship, two joins are required, one to join the
> source table and the join table, and one to join that result with the
> destination table. This, too, could be considered 'weird', because once
> more, the information about which other rows a certain row is related to,
> is not stored intrinsically along with the row itself, but extrinsically
> in a different table; except now, this is the same for both the source and
> destination.
>
> Of course, most people have no great problems wrapping their heads around
> both of these approaches, and don't consider either one 'weird', but
> simply a consequence of the general environment in which they occur. When
> you have a system of objects which have unique references, and you can
> easily refer from one object directly to another (and there are ways to
> collect references of objects together), the OO way comes out as being
> quite straightforward. Even without objects, per se, but 'only' records
> and arrays or sets, such organization makes quite a lot of sense.
> Similarly, in the relational data model, where rows have primary keys by
> which they can be identified, but where rpws also have fixed columns, and
> where join operations are the fundamental way of putting things together,
> using foreign keys and join tables is the straightforward, simple,
> effective solution. it is only if you view either solution from a
> viewpoint that does't take into account its environment, that things look
> weird. And in the end, both approaches are isomorphous and can be
> translated between automatically: whether you use pointers/object
> references on the one hand or joins on the other, both represent
> relationships between 'things', and both have the same power of
> expressiveness.


But OO tends to be inconsistent in how it does this. Different
languages and different designers all do it very differently than each
other. The relational rules of normalization allow less "creativity".
Specific views or functions can be created to serve app-specific needs,
but for maintaining the info itself, it is all still guarded/guided by
the relational normalization. Thus, relational designs look more like a
planned city while OO designs look like a hack-as-needed shanty town
with telephone and TV wires every which way depending on who made what
deal that week.


> >> >> The relational data model is specifically intended for data storage and
> >> >> access; it isn't intended to address anything beyond that.
> >> >
> >> >If that was true, then they would be happy with a file system alone.
> >> >Besides, original intentions can mean squat, per above.
> >>
> >> What precisely besides data storage and access is the relational model
> >> about? You should take note that I wasn't referring to persistence above.
> >> It's certainly not about computations or business logic or user
> >> interaction, for instance ...
> >
> >What do you mean "not about business logic"? I've seen complex IF
> >statements that could be reduced to table attributes.
>
> In that case, the business logic is represented as code in a language,
> which happens to be stored in the database; it still has to be read from
> the database and executed. The database is simply used as a mechanism to
> _store_ the code, not to _execute it_.

Yes, but the person putting the info in does not have to really worry
about that step. Something like Prolog also requires something to
actually "run" it, but the programmer is using mostly logical
deductions, not being concerned with "run". A DB-centric design is in
many ways similar.

>
> >Sure, something has to "run" those indicators,
>
> Exactly.

But a DB-centric solution tends to consider that an implementation
detail. One is more like a general who draws up the plans instead of
being the front-line soldier who actually carries out the orders from
the general. The general says where the troops go, which attack plan
they use, and when to use it. Then he shouts "go!" and soldiers follow
those plans. (Sure, this is an oversimplification, but you get the
idea).

Execution is a necessary step, but not necessary one to get all cought
up in.

>
> >but the runner is just like a player
> >piano: it does not care what the actual notes are in the paper roll, it
> >just "executes" them.
>
> But that analogy fails: the indentations on the roll for a player piano
> may not 'know' which notes they represent, but the player piano _does_
> know: for a specific indentation, it will play a specific note. In a
> player piano, that knowledge is encoded in the piano's hardware (the
> linkage between the parts that read the paper roll, and the parts that
> actually play the notes), but it is definitely there. The paper roll is
> useless without the knowledge about which indentation causes which note to
> be played on the piano.

True, but from the tune-writer's standpoint they DON'T HAVE TO CARE.
The composer does not have to give a flying [bleep] about how the gears
work inside the player piano.

That's the point. Necessary, yes, but not something that has to be the
primary concern. Being necessary and being the primary concern are
different issues.

>
> There has to be something that interprets
>
> >The knowledge in your brain can be represented as
> >attributes in neuron tables.
> >
> > table: Links
> > =================
> > sourceNode_ID
> > destinationNode_ID
> > weight // weighting factor, can be negative in some models
> >
> > table: Node
> > ===============
> > node_ID
> > activationFuncIndicator // (unit_step, sigmoid, piecewise_linear,
> >gaussian, and identity)
> > activationWeight // the "volume" given to activation function
>
> But something needs to know what to do with all this data. This is going
> to be something other than the database.

True, but that is not where the real power lies. The "execution"
process of Einstein's brain and Six-Pack Bob's brain are probably
nearly identical. It is the attributes and links between nodes that
make the difference, not the executor.

>
> >Now, whether all "IF" statements should be turned into table attributes
> >and set logic is a separate issue. The fact is it can be and is done to
> >various levels.
>
> Again, that's just using the database for _storage_ of the program.
> Something else still has to interpret the data in the databae, and use it;
> without that knowledge, the database, the schema, the data in it, are all
> completely useless. You could replace the database with a flat file, or an
> in-memory data structure, with no difference in what you're saying.
> There's a difference between soemthing that has been written down, and
> soemthing that has been written down _and the ability to interpret it_.

It is a matter of diminishing what we as the developers are concerned
with. Again, execution may be necessary, but does not have to be our
*primary* focus.

>
> >What I put in code and what I put in tables is mostly a decision about
> >what is most convenient, not what is possible. They are generally
> >*interchangable*. (Putting 99% of the app in tables is generally
> >creating a table-oriented interpreter for the most part.)
>
> Exactly.

But it can be an application-specific interpreter: a special-purpose
language of sorts so that we are dealing with domain-specific
abstractions.

>
> >Generally I use program code for Boolean statements. Boolean statements
> >are usually combersome to represent and manage in tables.
>
> ... which just goes to show that source code is often the most convenient
> representation for a programming language.

The point is to demininish the need for such languages, only using them
for what they do best instead of for everything.

>
> >> But it shows that you cannot use a relational database alone as a
> >> replacement for text files: You also need some form of parser to separate
> >> the text file into parts that you can put into the relational database.
> >
> >So? Maybe that kind of parser is not something best done with a DB.
> >However, it is perhaps prudent to store a complex grammar in a DB.
>
> But again, this is just using the database _for storage_. Which is fine,.
> of course, but again, you are just giving examples of using a database for
> _storing data_, which is what they should be used for.

It is modeling, not just "storage". Otherwise, files would be used.

>
> >> >> And I suspect that the code that would read an arbitrary stream of
> >> >> characters and puts it into your tables, was written in something _other_
> >> >> than SQL - because SQL is not good at the free-text processing that is
> >> >> necessary to get the data into a tabular format, because SQL is not
> >> >> general purpose.
> >> >
> >> >You mean not Turing Complete, per above.
> >>
> >> Not necessarily - please don't put words into my mouth. The point is that
> >> SQL was never intended to work with arbitrary streams of characters - it
> >> was intended to work with arbitrary tables (and rows and columns).
> >
> >But one *can* represent text as rows and columns. Maybe it is not the
> >most convenient approach in some cases, I will agree.
>
> One can represent _almost any_ form of data in tables of rows and columns
> - that's what the relational data model is about. But one can _not_ take
> an arbitrary set of data and let a database magically read it - because
> the organization of the data has to be done by something that knows how
> the data can be broken into the appropriate parts, as well as, of course,
> there needs to be something that describes what these appropriate parts
> are, how they are related to ne another, etc - you know, the database
> schema.

Objects don't read minds either.

>
> The point is that just because you have some set of data, and a database,
> you don't have a complete solution. You need to organize the data, and
> format the data into that organization. This requires things that the
> relational database model does not offer - but turing-complete languages
> do. Even if you were to write a parser whose rules you store in teh
> database, it's not the relational database doing the parsing, it's 'the
> parser' which will be a construct which is written, fundamentally, in a
> general-purpose programming language, even though it may be driven by data
> which is stored in tables in a database.

We've been over this already. One diminishes the need to deal with that
concern.

>
> OK, show us some use cases that support your assertions, then - that show
> that it is even _possible_. Show how a 'general-purpose string
> matching/finding tool' is a 'general-purpose tool' _outside_ of its domain
> - which is finding predefined patters within strings.

That is an interesting challenge that I am not up to. I've encountered
programmers who claimed they could do such and they gave some
interesting examples of what can be done. However, I don't have the
link right now. I suggest you try a Perl forum perhaps. Generally they
can use strings to represent stacks, queues, lists, passible
"functions" (code snippets that are later Eval'ed), etc. and regex's to
parse and unparse such strings.

>
> >> The point here isn't turing-completeness, actually: it's that some things
> >> only apply to _specific_ domains (such as regular expressions, which apply
> >> to the domain of strings;
> >
> >I am not sure if I would call string handling a "domain" in the sense
> >it is usually used. Strings are a general-purpose tool also, BTW.
>
> In what sense are strings a 'general-purpose tool'? They too don't _do_
> anything - they are simply a set of data (much like a relational
> database). One can certainly use a string as a description of a program
> (hey! source code stored in a text file!), but the string itself is just a
> string.

DAhhh! Not this again. You are obsessed with the "final button".

>
> >How is seeing most
> >things as objects worse than seeing most things as tables?
>
> Tables are static things, that just are; objects are dynamic things, that
> both are _and do_ things. As are functions and values in the functional
> paradigm, as are data astructures and procedures in a procedural paradigm,
> etc.

Please clarify "static".

[snipped yet more "final button" mantra]

> That, of course, is the reason why
> database vendors started adding procedural languages as extensions to
> their database enginesm even though those have nothing to do with the
> relational model per se.

They did it partly to lock in users to their product, and partly to
allow the same behavior *regardless* of app language. Thus, a database
trigger will execute when an update happens whether the app that did
the updating is Java, Python, Fortran, or Brainf*ck. In this sense, it
is a better "gatekeeper" than OO's encapsulation.

OO cannot so easily provide that. There is currently more knowledge in
the world about how to make "data oriented interfaces" more sharable.

Consider a graphing utility that makes pie charts, bar charts, etc. A
data-oriented interface would have one fill in tables with the various
attributes such as graph type (bar, pie, etc.), axis titles,
groups/lines, and then actual data (or a query to it). We would have at
least the graph table, the line/group table, and the data table (if
query not used).

One fills in the info, and then executes a "Run" or "Draw" step to draw
the chart. I agree that the DB itself may not actually execute the
graph and draw it. However, the attribute filler-upper does not have to
really care about how it is actually implemenented. The "run" step
implementation is not their concern.

An OO version of the same thing would generally resemble:

g = new graph(graph.context.PieChart());
g.addGroup("North", "Red");
g.addGroup("Middle", "Green");
g.addGroup("South", "Blue");
g.addData(.....);
.....
g.draw(); // execute it

Most of the "behavior" is simply defining the thing; that is, setting
attributes. Running it ("draw") is only about 2% of the whole code
deal.


> >You haven't shown how OO better handles behavior anyhow outside of
> >shape, animal, and device driver examples.
>
> This isn't specific to OO: the relational model is simply not a
> general-purpose tool outside its domain (managing data in tables); it
> simply can't handle anything outside that. Specifically, the relational
> model doesn't handle _behaviour_ _at_ _all_. The OO model _does_ handle
> behaviour. Even if it were to handle behaviour _badly_, that still means
> that it 'wins' over the relational model, because, as I mentioned, the
> relational model doesn't handle behaviour _at all_.

Only if one has to accept the ridiculous arguement that the car engine
should also double as the wheels when needed.

>
> Best wishes,
>
> // Christian Brunschen

-T-

.



Relevant Pages

  • Re: Three Kinds of Logical Trees
    ... that includes all code that uses the dbms api would have all such units ... a variable of type int, with an int value and invoke a method ... ... >> database would have all the data and functions it needs to address ... I have no plans to design a language, but I'm happy to learn anyway ...
    (comp.databases.theory)
  • Re: Another idea from pick goes mainstream...
    ... A relational database has nothing to do ... SQL is one of the ... is now considered THE language for any dbms implementing the relational ... Then, XML came out. ...
    (comp.databases.pick)
  • Re: If you were developing a database in Forth...
    ... database that worked well, and then Dennis Ruffer provided source code ... to do the same thing in some other language and it failed it was ... It doesn't matter if the project was a stunning success or a ... that failure was the language we invented. ...
    (comp.lang.forth)
  • Re: selecting a language
    ... > languages to write our application in, as Progress is becoming ... Progress is a 4GL language so it is very rapid development. ... database access from GUI code. ... Java has a lot of built in support for it. ...
    (comp.lang.java.programmer)
  • Re: OO vs. RDB challenge
    ... >> loans have greater risk so not all users that can approve a loan ... >> database engine for that matter)? ... >> language for somethings, ah well. ... string group) ...
    (comp.object)