Re: SQL



In article <1137996532.423030.167400@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
topmind <topmind@xxxxxxxxxxxxxxxx> wrote:
>(Part 1 of reply)
>
>> Persistent data is data that is stored externally between executions of
>> an application. RDBs are a response to that need combined with a
>> requirement that access be generic (i.e., the data can be accessed by
>> many different applications, each with unique usage contexts). That's
>> what DBMSes do -- they manage persistent data storage and provide
>> generic, context-independent access to that data storage.
>
>
>I disagree. You can use them that way, but I tend to view them as an
>"attribute management system". They do well modelling "things" in the
>real world (and virtual things) by keeping track of attributes of them.
>They also provide important and useful services such as concurrency
>management, joins (cross-referencing), sorting, and aggregation (sums,
>counts, averages, etc.) Mere persistence does NOT have to include
>things such as joins and aggregation, making one do them in app code
>instead.
>
>Back in my desktop-DB days, I created a lot of temporary tables to do
>things such as joins, filtering, and aggregation for task-specific
>temporary uses. The results were not kept beyond the task/module. Thus,
>I was using DB tools *without* any sense of "lasting".
>
>What would *you* call that? "Persistence" does not apply there.

You are still using a persistence mechanis; you're just choosing to not
use it for persistence. A file system is a persistence mechanism; lots of
applications put data into temporary files, which they delete after
they're done with them, sometimes because the available storage in the
filesystem is larger than in main memory (Photoshop, I believe, used to do
that, and maybe still does). This non-persistent use of the filesystem
doesn't make a filesystem any less of a persistence mechanism.

The relational model, and SQL, were developed specifically for persistent
databases. You can use them in a non-persistent manner, but that is
essentially using them contrary to their original intent and purpose.

By the way, how do you create and destroy these temporary tables? i.e.,
does the dbms manage their lifecycle for you, creating them as necessary
and removing them when you no longer need them, or do you have to perform
either or both of those steps yourself?

>> My point in this subthread is that such responsibilities are complicated
>> enough in practice that one does not want the DBMS to also manage and
>> execute dynamic business rules and policies. IOW, the DBMS should just
>> mind its own store. [This thread has been a veritable hotbed of puns.
>> I've probably made more in this thread than I've done in the last
>> decade. B-)]
>
>I agree that code does some things better and DB other things, and one
>uses them *together* in a Yin-Yang fashion. They compliment each other.

Relational Databases and SQL are tools for the specific task of storing,
accessing, modifying data - they are single-purpose tools (see definitions
below). They are 'general-purpose' _within their specific area_, as they
are not specific to any particular data access/storage/manipulation tasks,
but they are 'general-purpose' _only_ within their _specific_ domain,
which is the storage of data. A 'general-purpose' programming language is
one that allows one to write solutions to essentially arbitrary problems
using it, possibly with some specific exceptions (such as, 'python is a
general-purpose programming language, but due to its interpreted nature,
it shouldn't be used for writing interrupt handlers').

One thing to remember is that a RDBMS does _not_ do _anything_ that one
can't do in code on one's own - they are essentially just a pre-written
library, with a little domain-specific language as part of its interface -
whereas on the other hand, _most_ of the things you can fo in _code_,
_cannot_ be done in an RDBMS.

[ ... deletia, as I know nothing about UML ... ]

>> > Again, this gets into the definiton of "general purpose". I agree that
>> > query languages are not meant to do the *entire* application, but that
>> > does not mean it is not general purpose. File systems are "general
>> > purpose", but that does not mean that one writes an entire application
>> > in *only* a file system. It is a general purpose *tool*, NOT intended
>> > to be the whole enchilata.
>>
>> Huh?!? If you can't write the entire application in it, then it isn't
>> general purpose by definition.
>>
>> > A hammer is a general purpose tool, but that does not mean one is
>> > supposed to ONLY use a hammer. You need to clarify your working
>> > definition of "general purpose", and then show it the consensus
>> > definition for 4GL.
>>
>> huh**2?!? A hammer is not a general purpose tool by any stretch of the
>> imagination.
>
>Okay, then what is a "general purpose tool"? If I was going to put
>together a tool box for a trip where the mission details are not given
>ahead of time, I would certainly pack a hammer.

Yes, but you wouldn't expect to be using the hammer _unless_ you
encountered a problem that _specifically_ included nails. If you
encountered problems that included only screws, you'd never touch it;
you'd be using your screwdriver set instead.

>Only an idiot would
>not. No, it is not a one-size-fits-all tool, and I don't expect one.
>Good apps don't need a one-size-fits-all language because they can use
>yin-yang complimentary tools.

*sigh* Time for some rudimentary definitions:

single-purpose
useful for a single purpose, for a specific task, only

multi-purpose
useful for a number of specific tasks, but only a limited number still

general-purpose
useful for most tasks in general, though possibly with some specific
exceptions

all-purpose
useful for absolutely everything

A hammer is _not_ a 'general-purpose tool', because it isn't intended
or useful for general tasks, but specifically for beating nails into
stuff. If it also has a back end that lets you pry nails out, it might be
considered a 'multi-purpose' tool, but even that would be a stretch,
because you're still only working on the 'nails' bit. Either way, it's not
a 'general-purpose' tool by any stretch of the mind.

Actually, there's a saying that's apt, and which I think that 'topmind'
very clearly exemplifies:

"If all you have is a hammer, everything looks like a nail"

The point of which, of course, is precisely that a hammer is _not_ a
general-purpose tool, but that if you are accustomed to working only with
a specific limited toolset (and thus the associated set of problems it is
intended to solve), there is a tendency to try to see all other problems
as if theyy, too, were problems of that specific type. However, that is a
fallacy, as there are problems which clearly don't fit into such a narrow
mold.

Even topmind's own comments above show that he is fundamentally aware of
this: he says that a hammer is _one of_ the tools he would pack, so he
recognizes that there are many more tasks that he might encounter, but for
which a hammer is not a useful tool. But he confuses 'general-purpose'
with 'one-size-fits-all': Those are _not_ the same.

Of course, when it comes to computers, 'general-purpose' can frequently
come very _close_ to being 'all-purpose' simply because there are very few
problems that fall outside the 'general-purpose' area.

Java is a 'general-purpose' language, because you can write all sorts of
programms in it - from math-intensive scientific number-crunching, to data
storage and access, to graphical user interfaces, to distributed systems,
to ... etc. SQL _isn't_, because there are _vast_ areas of problems that
SQL not just osn't intended to address, but simply _cannot_ address,
however much you try to make it.

The need for 'yin-yang-complementary' tools arises in cases where it it
difficult to create multi-purpose or general-purpose tools: construction,
woodworking, metalworking etc, are all places where it is difficult to
create such tools. In computing, however, such tools abound: The computer
_itself_ is a gener-purpose tool. Let us not forget, in fact, that today's
digital stored-program computers are actually _general-purpose_ computers,
as ooposed to earlier, single-purpose computers, that were available (and
some still are!), such as for solving differential equations.

Procedural, funtional, object-oriented languages are all _general-purpose_
tools for programming computers, for writing essentially arbitrary
programs. SQL _isn't_. If it were, then why do database vendors create
languages to extend or 'hook into' the database (Oracle's PL/SQL), or
allow other languages to hook into the database (PostgreSQL allows perl
and python, as well as others, I believe)? There would be no need - *if*
SQL were a _general-purpose_ language - which it isn't.

And again, 3GL can be used to _write_ RDBMS; the converse is _not_ true.

>> > Please clarify. Something can still be within a paradigm and be general
>> > purpose. Further GP does not necessarily mean "all purpose", for
>> > nothing is practially all purpose.

While in computing, 'general-purpose' can quite frequently get reather
close to 'all-purpose' (i.e., the set of specific problems that a
'general-purpose' language cannot solve can grow very small), there is a
difference, yes; however, the difference between a single- or
multi-purpose tool and a general-purpose tool is _also_ still there.

Certainly, tools can be within a certain paradigm, and still be
general-purpose. For instance, object-oriented languages are within the
object-oriented paradigm, and are still general-purpose; likewise,
functional programming languages (like Haskell) are within the functional
paradigm, and are still definitelly general-purpose languages. However,
that is more because 'object-orientation' and 'functional programming' are
both paradigms specifically for general-purpose programming.

The relational data model is specifically intended for data storage and
access; it isn't intended to address anything beyond that. Any language
that is based on that specific model is going to be working within the
limits of what that model is intended to address. So, for instance, SQL is
great at modifying and accessing data in a relational database ... but is
not useful for anything else, because that is outside the scope of its
purpose.

So, whereas SQL is a 'general-purpose' language _within the scope of
database access_, it is a _single-purpose_ language if you view it from a
wider scope than that.

>> SQL is designed around the RDB paradigm for persistence. It can't be
>> used for, say, accessing lines in a text flat file because the text file
>> is not does organize the data the way SQL expects. So SQL is not a
>> general purpose interface to stored data. Apropos of your point,
>> though, SQL is quite general purpose for accessing /any/ data in a
>> uniform way from a data store _organized like an RDB_.
>
>Well, I agree that SQL is probably not a very good way to reference
>free-form text. However, just because it is not good for everything
>does not mean it is not general purpose. Again, NOTHING is good at
>EVERYTHING. Do you claim that there is something that is good at
>everything? No? I didn't think so.

Actually, for programming _everything_, you can always use machine code.
After all, that is what everything else comes down to: Everything that
you can do on a computer, you can do in machine code.

In order for a programming language to be 'all-purpose', it would only
have to be able to express everything you can express in machine code.
There are a lot of languages that come close - those are usually
'general-purpose' languages.

SQL does _not_ come close, due to inherent limitations in SQL.

>By the way, I have created tables similar to this:
>
> table: textFile
> --------------
> fileID
> ParagraphID
> SentenceID
> token (word or punctuation)
> tokenType (punctuation, word, non-printable, etc.)
>
>(Non-printable characters are represented in Hex notation.)
>
>It can be done.

But then you have taken what was an arbitrary stream of characters and
already broken it up into a representation that better suits your specific
model. And I suspect that the code that would read an arbitrary stream of
characters and puts it into your tables, was written in something _other_
than SQL - because SQL is not good at the free-text processing that is
necessary to get the data into a tabular format, because SQL is not
general purpose.

[ ... more deletia regarding UML ... ]

>> If the persistence is an RDB, then the subsystem implementation will
>> <probably> use SQL. If the persistence is flat text files, it will use
>> the OS file manager and streaming facilities. If it is clay tablets, it
>> will use an OCR and stylus device driver API. That allows me to plug &
>> play the persistence mechanisms without touching the application
>> solution because it still talks to the same interface regardless of the
>> implementation of the subsystem.
>
>Flat files don't have near the power. That is a poor analogy. RDBMS are
>MORE than persistence. Say it over and over until it clicks in. Just
>because YOU use it ONLY for persistence does not make it the only way
>to build systems, just the bloated reinvent-the-wheel way JUST so that
>you can claim UML purity.

Everything that an RDBMS does, can be implemented in a 3GL using flat
files for the low-level persistence layer. How do I know this? Because
that is one way to implement a RDBMS.

The relational model offers a useful, unified interface to data storage
and retrieval, with well-defined semantics - in such a way that it is a
very good solution to a specific problem; it is in fact, a
'general-purpose' solution _within its scope_ (much like a hammer is a
general-purpose tool for hitting things, and isn't inherently limited to
specific types of nails). But viewed from outside its scope, SQL only
addresses that particular problem, and thus isn't general-purpose.

The power of the combination of flat files and a 3GL, is greater than the
power of SQL - because the combination can implement all that SQL can,
_and much more besides_.

[ ... more UML deletia ... ]

>> > And as somebody pointed out, one can use SQL on flat files too. ODBC
>> > drivers can be created to hook SQL to spreadsheets, flat files, etc.
>>
>> Only if the data is organized around embedded identity and normalized.
>> Even then such drivers carry substantial overhead and tend to be highly
>> tailored to specific applications. IOW, you need a different driver for
>> every context (e.g., a spread***) and then it won't be as efficient as
>> an access paradigm designed specifically for the storage paradigm.
>
>So? What is the grand alternative? Of course different "devices" are
>going to need different drivers. That is a given. There is no
>one-size-fits-all driver. I have no idea what alternative you are
>envisioning, but it is probably in the category with unicorns and
>bigfoot.

And these 'drivers' would be written in ... ? something like a 3GL,
procedura, funtional or object-oriented, usually. You can't, for instance,
write a flat-file database driver in SQL, can you?

SQL, RDBMS, are all about offering a specific view of data, and a specific
standard interface to that data, to accessing it and modifiing it
(insertion, deletion, updates). This solves the 'data storage' problem for
a large number of aplications, and is, as such, a 'general-purpose data
storage' solution - but it addresses _only_ the data storage/retrieval
problem, whereas most other problems also have many other aspects that
need to be solved - which SQL can't do.

Best wishes,

// Christian Brunschen
.


Quantcast