Re: aggregate boundaries



Responding to Kurbylogic...

OK, so far this sounds like:

              * identified in  R1
[Transaction] -------------------------- [Payee]
     | *                          pays 1
     | identified in
     |
     | R2
     |
     | updates
     | 1                           1 child of
 [Account]              [Category] -----------+
     |                       |                |
     +--------+--------------+                |
              | R3                            | R4
              _                               |
              V         * contains            |
      [CategoryContext] ----------------------+
      + currentBalance

Note that the Category structure is just a simple GoF Composite pattern.
 What I am unclear about is what [Payee] represents.  I assume this is
who the money goes to when a check is written.  You suggest below that
[Payee] may be hierarchically organized as well.  In that case one would
need a similar Composite pattern for [Payee] as well (i.e., the name on
the check is only provided by a leaf).



My ASCII UML isn't very good, but it is more like this:
(When I preview, the size of my spaces seems to change and the lines
go everywhere, but if you note that Transaction/Category is a
many-to-many relationship, with the association class
TransactionCategory holding the amountAssigned, you might be able to
make sense of it.)


summarizes <composite> [Transaction] * ------- assigned to * [Category] | * | * | *| | | pays 1 [TransactionCategory] | | [Payee] +amountAssigned | | | * | | | | | updates | tracked by | [Account] * ---+ -------------1 [DataContext] 1 +

A couple of hints for text diagramming. First, make sure you are using a fixed-width font in your message editor. Second, never use tabs. Third, when reading forums like this make sure your mailer is set to display in plain text rather than HTML. [Some mailers default to a variable-width font when they convert from the newsgroup plain text to HTML. So this is important when you read other people's diagrams.] Fourth, try to keep it within an 80-character line width. So, I think you tried for (though some dashes and bars seem to have gotten lost in translation):


>
>                  summarizes               <composite>
> [Transaction] * ---------------- assigned to * [Category]
>  | *       |  *              |                   * |
>  |         |                 |                     |
>  |         | pays 1     [TransactionCategory]      |
>  |         [Payee]      +amountAssigned            |
>  |             | *                                 |
>  |             |                                   |
>  | updates     |             tracked by            |
> [Account] * ---+ -------------1 [DataContext] 1 ---+

I'm not sure which relationship 'tracked by' goes with; probably the one to [Category]. Is [DataContext] an associative object for the Account/Payee relationship? Or is Payee supposed to be associative to the Account/DataContext relationship? I'm also not sure what the '<composite>' indicates. That [Category] is further decomposed into a GoF Composite pattern?

Payee has a user-friendly name (as store numbers and localities are
often appended in OFX files) and provides name matching and automatic
category assignment services.  payee.IsMatch("Barnes & Noble -
Highlands Ranch"); payee.AssignCategories(transaction) adds a
TransactionCategory to the Books Category with the full transaction
amount.  Employer (a payer rather than a payee, perhaps, but it still
goes into the "payee" field on the transaction), however, knows my net
pay and how to find the budget for that period; the budget will then
determine what portion of the transaction amount should be assigned to
which categories (i.e., if I budgeted $100 for books but only spent $50
on books, then assign $50 to books, not $100).  Any amount left over can
be assigned to extra loan payments, savings, and/or investments in
whatever way the budget decides is best.  Lenders could perhaps provide
the loan amount and amortization statement, for example.

So you are using Transactions to update the budget amounts as well as the debits and credits, right?


I am a little concerned about Payee's and Employer's roles. I would expect the Category hierarchy and linking of Accounts to it to be fixed relative to the budget processing. That is, somebody else (e.g., an external configuration file or table in the DB) keeps track of what Categories there are and what Accounts they summarize. Then for any given execution the corresponding memory objects and relationships would be instantiated by factories from the configuration data. IOW, the Category hierarchy is orthogonal to and maintained by different software modules than the module where Transactions are posted.

I assume the same application is performing both tasks from the same UI. I am just arguing that the maintenance of the Category hierarchy is a quite different subject matter than the transaction posting subject matter, so I would expect it to be a separate subsystem. Then instantiating the data structures for transaction processing would be done as part of the startup infrastructure. [Even dynamically adding Categories would be done by sending a message to the transaction processing subsystem announcing the change. That message would be dispatched to the same factory that does the startup instantiation for that subsystem. So one is just getting the mapping information from another subsystem rather than the DB and the mechanics are pretty much the same.]
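A minimal Python sketch of that factory idea might look like the following. It assumes the configuration arrives as (category path, account id) rows; the names `Category`, `Account`, `StructureFactory`, `add_account`, and `load` are hypothetical, as is account 152. The point is only that startup loading and a later "category added" message can go through the same factory method.

```python
class Category:
    """A node in the Category hierarchy (aggregation level only)."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}   # child name -> Category
        self.accounts = {}   # account id -> Account


class Account:
    def __init__(self, account_id, category):
        self.account_id = account_id
        self.category = category


class StructureFactory:
    """Instantiates the in-memory objects from configuration data."""
    def __init__(self):
        self.root = Category("root")
        self.accounts_by_id = {}

    def add_account(self, path, account_id):
        # Walk the categories named in the path, creating any that are missing.
        node = self.root
        for name in path.split("."):
            if name not in node.children:
                node.children[name] = Category(name, parent=node)
            node = node.children[name]
        account = Account(account_id, node)
        node.accounts[account_id] = account
        self.accounts_by_id[account_id] = account
        return account

    def load(self, rows):
        for path, account_id in rows:
            self.add_account(path, account_id)


# Startup: rows as they might come from a configuration file or DB table.
factory = StructureFactory()
factory.load([("Expenses.Household.Repair", 147),
              ("Expenses.Books", 152)])

# Later: a "category added" message is dispatched to the same factory.
factory.add_account("Expenses.Household.Utilities", 160)
```

The transaction posting code never calls `add_account`; it only sees the finished structure.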

I would apply the same sort of separation of concerns to creating the budgets. It seems like you are capturing a reasonably complex suite of rules and policies for budget allocations in the software. (I had assumed that you defined the budgets manually through the UI.) If so, that also seems orthogonal to the actual transaction processing. So I would want to encapsulate it in its own subsystem. Then all this transaction processing subsystem would see are individual Transactions to set the budget amounts at each level and those transactions would be created somewhere else. IOW, I don't see those rules and policies as responsibilities for objects in this subsystem.

My point here is that I see our context so far being a subsystem whose subject matter is bookkeeping via transaction processing. That is, this subsystem isolates the actual record keeping associated with bill paying. As such, I see it as having limited responsibilities. It doesn't maintain the Category structure and it doesn't decide how budgets are allocated; those things are just given to it via data for factory instantiation (Category structure) or via transactions (budget amounts).

A transaction can only have one payee; thus in my database, transaction
has a foreign key to payee.  The transaction needs to know the identity
of the payee to save itself; payees will be saved prior to transactions
and the identity property updated.  However, it does seem that payee
needs to know a lot more about transactions (dates, names, amounts)
than transaction needs to know about payee (the name and identity).
Transaction really doesn't even need to know the payee name, because the
caller knows about payees and could ask the payee, but it would still
need to determine which payee to ask.  I suppose it could ask all of
them whether they contain the given transaction, but I think this will
create more problems than it solves.  The only thing it seems
transaction really does need to know is the identity.  The name of the
payee can be changed.  I might be able to cascade the changes locally
and modify the database gateway to accept the name rather than the ID;
however, if the name of the payee is changed on a different machine, the
service would not be able to find the payee by name when the changes are
saved.

I think that some of these problems go away if one adopts my more simplistic view of this subsystem's responsibilities. For example, Transaction doesn't need to know who the Payee (or Employer!) is here. All it really needs to understand is the Account being updated and how it should be updated. At most all one needs is an attribute in the Transaction to identify the Payee if one also writes the Transaction itself to the DB. That is, I would expect to see no Payee object in this subsystem; it would exist where the Transaction was defined.


Which brings me to your Transaction/Category relationship. You are associating the Category hierarchy directly with Transaction via the *:* relationship. When I see *:* relationships I look for ways to simplify them because the infrastructure (TransactionCategory) to maintain the association tends to get complicated. However, that sort of reification has to be consistent with the problem space.

In the Money/Quicken model, a transaction is posted to a single Account. (Let's ignore splits to keep it simple.) It is the Account that is defined as a member of a Category hierarchy (e.g., Expenses.Household.Repair). IOW, what one conceptually has is

          *    belongs to 1
[Account] ----------------- [Category]
+ accountID = 147           + categoryName = Expenses
                            ...
                               A
                               |
              +----------------+--------....
              |                |
           [SubA]            [SubB]
           + subName         + subName = Household
                               A
                               |
                              ...

All Transaction needs to know about is '147' to properly address the update to a particular object. The UI just makes things more convenient for the user by providing Expenses.Household.Repair to fill in rather than 147. In addition, I contend that the notion that Accounts are leaves in a Category hierarchy is natural in the problem space. I also contend that from a bookkeeping viewpoint, the Category of the Account is not relevant to correct posting. So there isn't a direct relationship between Transaction and Category. [More precisely, there doesn't need to be. B-)]

Nonetheless, one still has to maintain the connection between a Transaction and the Category hierarchy because that is what the user can most easily grok when addressing a Transaction. Also, the Categories are organized in a hierarchy, so we really should have a structure that reflects that. In addition, the actual Account is only associated with leaves in that hierarchy; the upper levels are used solely for aggregation.

IOW, I am still pushing hard for the Composite pattern view that relates Transaction directly with Accounts and indirectly with Categories. B-)

I am also prepared to argue that when it comes time to report off the structure things will be simpler because the aggregation levels will already be hard-wired into the hierarchy. It will probably be more efficient as well. (Whatever tables you have in the associative object will probably need to be ordered so insertions will require searches or a more elaborate data structure for cross-referencing.)

The problem I think that I'm having is that in OO the object reference
is the identity, which adds unnecessary dependencies and enables the
holder of the identity/object reference to send messages to the object
directly without going through the proper channels.  In the database
world you might have the transactionId, but you cannot ask the
transaction table to update the transaction with a payeeId it does not
know how to verify.  Instead you must ask the database, which knows
about foreign key constraints; the database asks the payee table to
validate the payeeId and only then sends a message to the transaction
table to update transaction x with the given payeeId.  This indirection
is missing in OO, even in my model, where updates must go through the
datacontext: since the identity is also the object reference, once
that identity is given to the client, the client can "misuse" that
identity by sending a message directly to the object without going
through the datacontext, which knows how to validate the invariants
before updating the object with the given identity.

I think the key here is that in the OO paradigm relationship implementation, instantiation, and navigation are orthogonal to class semantics (at least at the OOA/D level). More important, relationships provide the means for capturing rules and policies for access in static structure rather than in behavior dynamics.


Consider my model. When a new Transaction is created one probably has the Category definition of the Account (Expenses.Household.Repair) because that's what comes from the UI. So the factory that creates the Transaction object "walks" the hierarchy downwards from the root -> Expenses -> Household -> Repair. At that point the factory has an Account object in hand so it assigns that reference to a pointer in Transaction to instantiate the R2 relationship in my model (presumably as the constructor is invoked).

Once the Transaction object is created a message is sent to it (e.g., post(...)). Now Transaction does its thing by deciding how to post (e.g., debit/credit vs. delta) and updates the Account at the end of the R2 relationship. At that point Transaction couldn't care less whether that Account is 147, Expenses.Household.Repair, or any other identity that might be of interest to an RDB. That is, Transaction is quite confident it has the right one in hand because it was someone else's job (the factory creating it) to figure out participation and instantiate the relationship correctly.

Transaction then goes on to update the various aggregation levels. To do that it "walks" the Category tree upwards from the Account object at the end of R2 until it gets to the root. For each one it provides the same update. In doing that Transaction needs to know nothing at all about the semantics of the actual categories. That's because it was someone else's job to make sure the tree is properly organized so that when the relationships are "walked" one gets the right objects in hand for updating. Thus properly constructing the GoF Composite structure has statically captured a lot of problem rules and policies so that Transaction can do its job while knowing almost nothing about the objects it updates, much less why it is updating those particular objects.

The result is that the code to do the updates in Transaction will be much simpler (e.g., no conditions to test). There will be some code in the factories that instantiate the relationships, but it will also be simpler because it is highly focused on the invariants of hierarchical structure. As an added bonus one uses exactly the same navigational structure (i.e., invokes the same reference getters in the same basic way) wherever one is in the structure and whether one is instantiating objects or posting.
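The two walks described above (downwards in the factory, upwards when posting) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (`Category`, `Account`, `Transaction.post`, `find_account`), the signed integer amounts, and account 152 are all hypothetical, and the balance is kept directly on each Category rather than in a separate context object.

```python
class Category:
    """Aggregation level in the Composite tree."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.account = None      # set only on leaf categories
        self.balance = 0
        if parent is not None:
            parent.children[name] = self


class Account:
    def __init__(self, account_id, category):
        self.account_id = account_id
        self.category = category
        self.balance = 0
        category.account = self


def find_account(root, path):
    """Factory helper: walk downwards from the root to the leaf's Account."""
    node = root
    for name in path.split("."):
        node = node.children[name]
    return node.account


class Transaction:
    def __init__(self, account, amount):
        self.account = account   # the R2 relationship, set by the factory
        self.amount = amount

    def post(self):
        # Update the Account at the end of R2 ...
        self.account.balance += self.amount
        # ... then apply the same update at every level up to the root.
        node = self.account.category
        while node is not None:
            node.balance += self.amount
            node = node.parent


root = Category("root")
expenses = Category("Expenses", root)
household = Category("Household", expenses)
repair = Category("Repair", household)
books = Category("Books", expenses)
Account(147, repair)
Account(152, books)

Transaction(find_account(root, "Expenses.Household.Repair"), 50).post()
Transaction(find_account(root, "Expenses.Books"), 20).post()
```

Note that `post` never inspects a category name; it only follows the `parent` references that someone else set up.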

My major point here is that the collaborations involve relationship navigation via references but that navigation is based on Class Model structure rather than the semantics of the individual classes along the navigation path. There is no way to avoid some degree of dependency when objects collaborate. However, the way the OO paradigm deals with relationships goes a long way towards minimizing the coupling.

[FWIW, I think it is sad that most OO books seem to underestimate the role that relationships play in good OO design. Even fewer point out the crucial role that relationship participation plays in narrowing access to state variables (attributes). Relationships are essentially the OO paradigm's solution to problems of global data management in procedural development. But I digress...]


If I wrap the object, the client would only be able to access the properties defined on the wrapper. The best idea I can come up with is to create an object that acts as a surrogate identity in place of the database identity; thus my datacontext looks just like my web service interface but uses these surrogate objects for identity instead.

Thus:
public class PayeeId {}
public class AccountId {}
...
public class Account {
  public AccountId AccountId { get; set; }
  public string Name { get; set; }
}
Account account = dataContext.GetAccountByName("Checking");
Transaction transaction = new Transaction(....);
dataContext.Add(account.AccountId, transaction);
/* implemented something like:
{
  // reject unknown identities before touching any state
  if (!accounts.Exists(accountId)) throw exception;
  if (transaction.PayeeId != null && !payees.Exists(transaction.PayeeId)) throw exception;
  // other verifications

  // update account and category balances

  // store transaction list separate from account detail
  IList accountTransactions = _transactions[accountId];
  transaction.TransactionId = new TransactionId();  // assign identity
  accountTransactions.Add(transaction.Clone());     // store a clone, not the client's object
}
*/

I think this could work because now I would have something other than
object reference to use as identity thus allowing me to clone the
object and protect it from direct modifications by the client, yet
without datacontext needing to know the various payee subtypes.
Thoughts?

You are getting into a somewhat different area here. The OOPLs are 3GLs and that means that they tend to have problems with physical coupling because of the needs of type system validation. This has made Dependency Management at the OOP level a buzz phrase with entire books written about how to get around the problems to make the OOPL code more maintainable. Your wrapper solution is one such technique.
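For readers who want to see the surrogate-identity idea run, here is a small Python sketch of it. It is an assumption-laden illustration, not the poster's actual design: `DataContext`, `create_account`, `balance`, and `transactions` are hypothetical names, amounts are plain integers, and the payee and category checks are left out.

```python
import copy
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class AccountId:
    """Surrogate identity: an opaque value, not an object reference."""
    value: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass(frozen=True)
class TransactionId:
    value: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Transaction:
    amount: int
    transaction_id: Optional[TransactionId] = None


class DataContext:
    def __init__(self):
        self._balances = {}       # AccountId -> balance
        self._transactions = {}   # AccountId -> list of stored clones

    def create_account(self):
        account_id = AccountId()
        self._balances[account_id] = 0
        self._transactions[account_id] = []
        return account_id

    def add(self, account_id, transaction):
        if account_id not in self._balances:
            raise KeyError("unknown account")
        transaction.transaction_id = TransactionId()   # assign identity
        stored = copy.deepcopy(transaction)            # keep a private clone
        self._balances[account_id] += stored.amount
        self._transactions[account_id].append(stored)
        return stored.transaction_id

    def balance(self, account_id):
        return self._balances[account_id]

    def transactions(self, account_id):
        # Hand out copies so callers cannot mutate the stored records.
        return [copy.deepcopy(t) for t in self._transactions[account_id]]


ctx = DataContext()
checking = ctx.create_account()
t = Transaction(amount=-50)
ctx.add(checking, t)
t.amount = -5000   # changing the client's copy does not touch the stored one
```

Because the client only ever holds an `AccountId` and its own copy of the transaction, the only way to change a balance is through `DataContext.add`.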


Alas, I am probably not the best source for dependency management since I am a translationist and don't have to muck around in 3GL code. When one uses a full code generator code maintenance becomes academic. (I probably haven't written 10 KLOC of 3GL code since 1990.)

[Most of what I have been talking about above is at the higher OOA/D level of abstraction where one identifies objects, relationships, and messages. There this sort of thing isn't an issue because the OOA/D doesn't care how identity is implemented (pointers, attributes, array indices, whatever). Thus the abstract action languages for OOA/D use things like relationship identifiers for addressing.]

On replicating the data base...

This is a very different and potentially very nasty problem.
Essentially your application is caching data from the DB that is not in
synch with the DB for extended periods of time.  Money/Quicken don't
have this problem because they update the DB with every transaction
immediately.  (Note that newer versions of Quicken don't even have a
Save button anymore.)  Enter stage left, tripping: DB replication.

You have an added problem due to the complexity (nesting) of your
summaries.  Money/Quicken basically have only one total number to keep
track of at the Account level, the current balance.  Your summaries add
another level of complexity due to the nesting and the requirement that
the subcatagories sum to the supercategory.

However, I don't think the problem is insurmountable if you have a
complete copy of the DB on your laptop.  [If you don't then you will
have a problem reading Transactions for UI display when the base is
unavailable.  B-)]  So long as that laptop copy is internally consistent
for recent activity, you can just do the brute force approach and copy
it to your home desktop or whatever when you get the chance.  IOW, do a
poor man's replication by treating the home server as a backup.


All categories and payees will be copied to the client, but only
transactions for the current period.  The reason I don't want to save
changes immediately is that I want to be able to preview the changes
that will be made when I import an OFX file.  If I discover that the
rules (auto categorization) I've defined are not behaving correctly, I
can change them without needing to manually fix them afterwards.  I
just like to make things as complicated as possible :) but really I'm
actually trying to force myself to tackle some issues I've somehow
managed to otherwise avoid.

Wow. You must have a masochistic streak. B-) This (undo in its various incarnations) is yet another nasty problem.


I don't have much experience here. The last time I did anything remotely related was a What-If economic simulator back in the '70s. For that I just used mirror records in the DB and juked my DB engine to look for mirror records first. One bit of the key was for mode, which determined whether it was a mirror record or not. When in the What-If mode all records written had the mirror bit set. When done with the simulation the mirror records were either removed (don't want the changes) or used to update the originals and then removed (want the changes). Not terribly sophisticated, probably not bullet-proof unless What-If scenarios are handled very discretely, and perhaps not as efficient as it could have been. But it worked for that context.
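The mirror-record trick can be sketched in Python roughly as follows. This is a toy in-memory stand-in, not the original DB engine: the `WhatIfStore` class, its method names, and the "acct:147" key are hypothetical, and the sketch assumes the mirror records are dropped once their values have been copied onto the originals.

```python
class WhatIfStore:
    """Keys carry a mode bit: (key, True) is a mirror record."""
    def __init__(self):
        self._records = {}     # (key, is_mirror) -> value
        self.what_if = False   # while True, every write sets the mirror bit

    def write(self, key, value):
        self._records[(key, self.what_if)] = value

    def read(self, key):
        # Look for a mirror record first, then fall back to the original.
        if (key, True) in self._records:
            return self._records[(key, True)]
        return self._records[(key, False)]

    def _mirror_keys(self):
        return [k for k in self._records if k[1]]

    def discard(self):
        # Don't want the changes: remove the mirror records.
        for k in self._mirror_keys():
            del self._records[k]
        self.what_if = False

    def commit(self):
        # Want the changes: copy each mirror onto its original, then drop it.
        for key, _ in self._mirror_keys():
            self._records[(key, False)] = self._records.pop((key, True))
        self.what_if = False


store = WhatIfStore()
store.write("acct:147", 100)

store.what_if = True
store.write("acct:147", 60)
preview = store.read("acct:147")        # sees the mirror record
store.discard()
after_discard = store.read("acct:147")  # original is untouched

store.what_if = True
store.write("acct:147", 75)
store.commit()
after_commit = store.read("acct:147")
```

An OFX import preview would fit the same shape: import in What-If mode, inspect, then either discard or commit.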

For the posting mechanics to ensure internal consistency you just need
to "walk" the Composite tree above and make the same adjustment at every
level above Account _when you update Account_.  That will ensure that
the numbers are consistent so long as you do the update prior to
processing any other change.  [Note that if the transaction in hand is
changed at the account level you obviously have to post the delta of the
change to Account and the categories.  But that delta will be the same
at every level.]

To do that you just need to keep a lastBalance attribute in Transaction.
 When you create a Transaction or have completed the update of Account
and the Categories for a change, you set lastBalance to currentBalance.
 So the only time they will be different is when you have made a change
to an existing transaction in the UI.  You just need to look for that
when you post to Account to determine whether you need the straight
amount or a delta.  IOW, just make sure you complete the posting before
processing another Return in the UI.
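One possible reading of the lastBalance idea, as a Python sketch: here `current_amount` and `last_amount` stand in for currentBalance and lastBalance on the Transaction, and the `posted` flag used to tell a new transaction from an edited one is an added assumption. Only the Account update is shown; the same delta would be applied at every Category level above it.

```python
class Account:
    def __init__(self):
        self.balance = 0


class Transaction:
    def __init__(self, account, amount):
        self.account = account
        self.current_amount = amount
        self.last_amount = amount   # equal to current until the user edits it
        self.posted = False         # hypothetical flag: has this ever been posted?

    def post(self):
        if not self.posted:
            delta = self.current_amount                      # new: straight amount
        else:
            delta = self.current_amount - self.last_amount   # edited: delta only
        self.account.balance += delta
        # Posting is complete, so bring the two amounts back in step.
        self.last_amount = self.current_amount
        self.posted = True
        return delta


account = Account()
t = Transaction(account, 50)
first = t.post()           # new transaction

t.current_amount = 30      # user edits the amount in the UI
second = t.post()          # only the difference is posted

third = t.post()           # nothing changed, so nothing is posted
```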



This is what I want to do, but unfortunately the client could perhaps bypass the datacontext by adding the transaction directly to the account without updating the affected categories. So I tried to avoid the issue by storing the data in only one place (borrowing from the relational model). Unfortunately this doesn't work in OO, because even if the data is stored in only one place I can't enforce the uniqueness constraints if the client bypasses the datacontext.

Ah. This gets back to my issue about subsystem cohesion. Deciding what Transactions to issue and when to issue them is a whole other set of rules and policies from posting them. In my scenario above all transactions look pretty much the same, have limited scope, and are processed the same way. Then there is only one path for processing the transaction (the actual posting). One either posts a Transaction or not, but the subsystem doing the bookkeeping is not concerned with whether the Transaction is correct or whether it should have been posted in the overall problem context; it just does its thing.


I suggest that if one separates the concerns of posting from those of transaction formation, it will become easier to get the triggering of transactions right. For example, deciding whether lastBalance should be the same as currentBalance becomes a clearer decision based on where the data is coming from. I'll assume any changes come through the UI. Then in the UI if the data displayed originally came from the DB (i.e., the UI was initially given the data by some other subsystem), then one doesn't touch lastBalance (which will not be visible in the UI). One can keep track of the source in the UI easily via an attribute when a display window is created. If the data displayed comes solely from the user, the UI copies currentBalance to lastBalance before shipping it elsewhere. Thus the rule is simple and easy to apply in one place.

Even if there are automated contexts where a transaction might be generated without user input, I would bet that you know whether the transaction is new or not at that point and the same logic would apply.


************* There is nothing wrong with me that could not be cured by a capful of Drano.

H. S. Lahman
hsl@xxxxxxxxxxxxxxxxx
Pathfinder Solutions  -- Put MDA to Work
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
(888)OOA-PATH


