Re: Business objects, subset of collection



select i.invoiceid
from invoice i join payment p on i.invoiceid=p.invoiceid
group by i.invoiceid
having sum(p.amount) < i.amount and datediff(now(), i.duedate) >= 10

This is the kind of code I write every day. Even though the number of
invoices and payments is very high, the queries complete within a
number of milliseconds. The customer is happy, I am happy.

Once again, if that is all you are doing, that is CRUD/USER processing;
you are just moving piles of data back and forth between the UI and the RDB.

Maybe you can show your code performing the same task.

You want code for inventory forecasts? A linear program to allocate
advertising budget to various markets' media? A simulation model of
atmospheric diffusion?

No. I was asking for the OO equivalent of the SQL statement above.
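
For reference, here is a minimal sketch of what an object-based equivalent might look like in Python. The Invoice and Payment classes and the overdue_unpaid_ids function are hypothetical names made up for illustration; nobody in this thread posted them. It does a plain linear scan over the invoices, so it says nothing about how a tailored index would be built.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass
class Payment:
    amount: Decimal


@dataclass
class Invoice:
    # Hypothetical business object; field names mirror the SQL columns.
    invoice_id: int
    amount: Decimal
    due_date: date
    payments: list[Payment] = field(default_factory=list)

    def paid_total(self) -> Decimal:
        return sum((p.amount for p in self.payments), Decimal("0"))

    def is_overdue_unpaid(self, today: date, grace_days: int = 10) -> bool:
        # Mirrors the query: the inner join drops invoices without payments,
        # the HAVING clause keeps those paid less than the invoice amount
        # and due at least grace_days ago.
        return (
            bool(self.payments)
            and self.paid_total() < self.amount
            and (today - self.due_date).days >= grace_days
        )


def overdue_unpaid_ids(invoices, today):
    # Linear scan over every invoice: O(N), no index involved.
    return [inv.invoice_id for inv in invoices if inv.is_overdue_unpaid(today)]
```

The filter itself is short; the classes, the loading of the objects, and any index that avoids the full scan are the parts that have to be written by hand.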

That paradigm is fine for generic data storage and
access but searching large sets sucks for algorithmic processing.
SQL databases suck for searching large data sets? Come on...
You don't deny my assertion that I can perform the same O(log N)
optimization in the implementation of a <reusable> collection class.

Of course not. The difference is that you have to do it by yourself. I
can reuse existing tools instead. That is the main difference between
using a database and not using a database.

And, as I have said at least twice, the price one pays for superior
performance is hand crafting of the optimization.

There are areas in which B-trees are not sufficient. But if you are
claiming superior performance, I think you have to be very careful
about telling in what context. O(log 1000) is not superior to O(log
10000).

In CRUD/USER
processing that is largely irrelevant because (a) data is accessed once

If the complexity of the algorithm is the same, the number of accesses
is irrelevant.

So one /must/ be able to achieve more efficient searches
given OO's object-level instantiation. The price one pays for that
efficiency is that the object-level instantiation has to be hand-crafted
based on the particular problem context.

Yes, I agree that by writing the corresponding code yourself, you
might get faster applications than you would if you had used a SQL
database.

But development time "usually" costs more than CPU time. And the only
thing you have shown is that databases have complexity O(log A) and a
tailored solution O(log B), where B < A. This is not a very big deal
compared to the "extra tons of keystrokes" you would have to type.

It /is/ a big deal when the same data aggregate is accessed many times.

log10(100000) = 5 and log10(1000) = 3; it is not that big a deal.
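
The arithmetic can be checked quickly. This is only a sketch, assuming base-10 logarithms as above and a lookup cost proportional to log n; the function name is made up for illustration.

```python
import math


def lookup_cost_ratio(larger_n: int, smaller_n: int) -> float:
    # Ratio of O(log n) lookup costs for two collection sizes,
    # assuming the constant factors are the same in both cases.
    return math.log10(larger_n) / math.log10(smaller_n)


# Searching 100000 rows versus a tailored collection of 1000 objects.
ratio = lookup_cost_ratio(100_000, 1_000)
print(f"{ratio:.2f}")
```

Under these assumptions a 100-fold smaller collection makes each lookup well under two times cheaper, not 100 times cheaper.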

Caching is managed by the database. Caching is not the concern for the
application designer.

Nonsense. Every non-CRUD/USER application I have ever seen had a
client-side cache in some form. By the time the requests get to the
DBMS, it is too late. The DBMS caches just manage server access and
resources (unless you are talking about memory mapped OODBs). One caches
on the client side to minimize DB access.

Another way to eliminate the inter-process communication overhead is
to use stored procedures for "complex processing". That is why most
TPC (www.tpc.org) implementations rely heavily on stored procedures.
Obviously you could also use a memory-mapped SQL database.

If one used the
same accessing paradigm internally in the solution as the RDB uses, the
application would be brought to its knees.

There are scenarios where mainstream SQL databases wouldn't perform
well. But they do excel in many application areas. The major problem
with current SQL databases is the limited set of index types that are
used. B-trees are used as a one-size-fits-all solution. Obviously
better support for other index types is wanted. I can imagine
scenarios where foreign keys implemented as pointers would be a good
thing too. As a matter of fact, I think such databases exist, even if
I don't have time to find references to support this claim.

SQL databases excel in CRUD/USER processing, which is what I keep
saying. Conversely, they suck once one gets out of that realm.

Since your definition of CRUD/USER seems to be extremely wide, I am
quite happy with this statement.

As an example, I play an MMORPG. The game employs an RDB, a star
client/server model, and thin clients. That is a great architecture for
airline reservations systems but it makes the MMORPG non-scalable.
That's because the same data is constantly being accessed and updated by
groups of clients involved in the same interactions (i.e., many times
per second).

I don't know anything about MMORPGs, but doesn't an airline reservation
system constantly access the data too?

So the cardinal rule of complex application
development is to read the data once and write it once, no matter how
many times it must be accessed in the solution.

Read and write once from what, disk? Or RAM?

The DBMS. Disk seek is the big problem but the table-level searches are
still an important problem when the same data is accessed repeatedly in
a single problem solution. (Or there are many possible relationships
among the data or the data is constantly being updated.) It is also
important to be able to convert identity into formats where one can use
more efficient data structures, like arrays.

For business applications anyway, I think it would be difficult to
find scenarios where it would really be necessary to use such a
low-level data structure as an array.

But the 'n' in
O(log n) will usually be much smaller in the OO application because the
collections are object-based rather than class-based.

Let's say you want to find all unpaid invoices. Why would the n be much
smaller in an OO solution?

I said, "usually". You are postulating a class-based search as a problem
requirement.

You might think that my example is too extreme, but isn't it good to
use methods/tools that don't limit you to working on small amounts of
data?

Who is limited to working on small amounts of data?!?

You complained about my example, since it was a class-based search
and not a search which could be solved with a limited number of
objects in a collection. So why don't you show how you would find all
unpaid invoices?

I've already pointed out that you are postulating a requirement that
/requires/ a table level search.

I just took an example from reality. The point was to show that your
"usually" disclaimer is debatable. In reality, there exist a lot of
scenarios where "table level" searches are necessary. You may like it
or not.

[I have also pointed out that even in
that situation there are possible ways to avoid it, such as a collection
dedicated to just unpaid invoices.]
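
As an illustration of that idea, here is a rough Python sketch of a collection dedicated to unpaid invoices. UnpaidInvoices and its methods are hypothetical names, not code from either poster. The collection keeps only invoices with an open balance, so a search touches that subset rather than every invoice. Unlike the SQL statement at the top, it also keeps invoices that have received no payment at all.

```python
from datetime import date


class UnpaidInvoices:
    """Hypothetical collection that holds only invoices with an open balance."""

    def __init__(self):
        # invoice_id -> (outstanding_amount, due_date)
        self._open = {}

    def add_invoice(self, invoice_id, amount, due_date):
        self._open[invoice_id] = (amount, due_date)

    def record_payment(self, invoice_id, amount):
        # Raises KeyError if the invoice is unknown or already fully paid.
        outstanding, due_date = self._open[invoice_id]
        outstanding -= amount
        if outstanding <= 0:
            # Fully paid: the invoice leaves the collection.
            del self._open[invoice_id]
        else:
            self._open[invoice_id] = (outstanding, due_date)

    def overdue(self, today, grace_days=10):
        # The date test depends on the current day, so it still has to be
        # evaluated at query time, but only over the unpaid subset.
        return sorted(
            invoice_id
            for invoice_id, (_, due_date) in self._open.items()
            if (today - due_date).days >= grace_days
        )
```

The cost is the bookkeeping: every payment has to go through record_payment so that the collection stays consistent with the invoices.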

I could show the complete solution with 4 lines of code. The reason
you are only pointing at your solutions is that they need many more
lines of code. That is what I wanted to point out. Besides, how would
you solve this part: "datediff(now(), i.duedate) >= 10"?

//frebe