Re: Neutral Format as a Coupling reduction idea that doesn't work.



Responding to Carter...

Ever had a problem of two classes / frameworks / packages / systems...
that have to be coupled in some way?

Info has to pass between the two.

So the obvious gets done and one side or the other gets to use a rich fat
object from the other side. And WOW! Oh the PAIN! The two sides are
now deeply and horribly coupled by all the things that rich fat object
depends upon.

Often you can't even compile the ruddy thing!

Right. Don't Do That. Object references totally trash encapsulation because the object exposes part of the sending subsystem's implementation. (There wouldn't be a reference to pass if the object wasn't needed to implement the sender subsystem.) There are potentially unlimited and unexpected side effects for both sender and receiver.

At that point some bright soul steps forward and says, it's the problem of
having the rich fat object. Reduce the information to a simple neutral
format like a number, a string, a CSV file, or an XML file and we're done.

Grreat. Now we're making fast progress. We can unit test, we can validate
the XML, we can do Good Things. That worked, we decoupled the two systems.

Did we?

That depends. The mechanics (e.g., XML) are not very important compared to using a pure message paradigm: {message ID, <by-value data packet>}. That greatly reduces the coupling. But there is no way to eliminate it completely. There is going to be coupling any time information is passed. [Among other things there is an inherent timeliness issue: is the data still valid by the time the receiver processes it? It may not be if the subsystem interface serializes through a queue.]
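
A minimal sketch of the {message ID, by-value data packet} shape, in Python. The names Message, make_message, and the SET_BIAS ID are made up for illustration; nothing here comes from a particular framework. The packet is deep-copied at send time so it shares nothing with the sender.

```python
import copy
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class Message:
    """A pure message: an identifier plus a by-value data packet."""
    message_id: str
    packet: Mapping[str, Any]


def make_message(message_id: str, data: dict) -> Message:
    # Deep-copy so the packet shares nothing with the sender's own state,
    # then wrap the copy in a read-only view (top level only).
    return Message(message_id, MappingProxyType(copy.deepcopy(data)))


# Sender side (hypothetical internal state).
sender_state = {"millivolts": 12.5, "history": [12.4, 12.5]}
msg = make_message("SET_BIAS", sender_state)

# The sender keeps changing its own state after sending...
sender_state["millivolts"] = 99.0
sender_state["history"].append(99.0)

# ...but the packet still holds the values as of send time.
print(msg.message_id, msg.packet["millivolts"], msg.packet["history"])
```

Note that the last line also shows the timeliness issue: the packet is already stale relative to the sender. The copy removes the shared-data coupling, not the coupling in time.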


Then why during the maintenance phase do we have all kinds of issues
relating to ...
* Semantics - (What exactly did that field mean again?)

Both sides interpret the message in terms of their own context. To do that both sides must share the definition of the message itself. That definition defines the semantics of just the data packet. Thus field 3 in the message is a voltage in millivolts expressed as a double precision floating point number. But the sender may interpret that as a "pin 5 bias" while the receiver may interpret it as a "channel 16 offset".

IOW, there is a difference between the semantics of the message and the semantics of the subsystem subject matter. At some point these views still need to be coordinated so that the mapping between subsystem subject matters through the message is correct. But that is a problem for the developers of each subsystem when negotiating the interface. That represents a solution level (Systems Engineering level, if you will) coupling but at least it is very narrowly focused on the message definition rather than what each side does with the information. IOW, it is a lot easier to negotiate

sender semantics -> message semantics -> receiver semantics

than

sender semantics -> receiver semantics.

because the level of abstraction of the message is higher.
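
The two-step mapping might look roughly like this. It is only a sketch: VoltageMessage, MSG_VOLTAGE, and the Sender/Receiver classes are hypothetical names, and the message ID value is arbitrary. The point is that the shared definition knows about millivolts but nothing about pins or channels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoltageMessage:
    """Shared message definition: says nothing about pins or channels."""
    message_id: int
    millivolts: float  # voltage in millivolts, double precision


# Hypothetical message ID agreed on when the interface was negotiated.
MSG_VOLTAGE = 1


class Sender:
    """Thinks in terms of pin biases."""

    def __init__(self) -> None:
        self.pin_bias_mv = {5: 12.5}

    def encode_pin5_bias(self) -> VoltageMessage:
        # sender semantics -> message semantics
        return VoltageMessage(MSG_VOLTAGE, self.pin_bias_mv[5])


class Receiver:
    """Thinks in terms of channel offsets."""

    def __init__(self) -> None:
        self.channel_offset_mv = {}

    def handle(self, msg: VoltageMessage) -> None:
        # message semantics -> receiver semantics
        if msg.message_id == MSG_VOLTAGE:
            self.channel_offset_mv[16] = msg.millivolts


receiver = Receiver()
receiver.handle(Sender().encode_pin5_bias())
print(receiver.channel_offset_mv)  # {16: 12.5}
```

Neither class mentions the other's vocabulary. Whether "pin 5 bias" really should land in "channel 16 offset" is still the negotiation problem described above; the code only keeps that decision in one narrow place on each side.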

* Currency - (You changed something? Why didn't you tell us?)

That is not a problem for the message definition. Rather it is a problem for when the interface is invoked. Currency is, indeed, a level of coupling that can go wrong even for a message with no data packet at all. But it is also about as benign as coupling can be. One way to classify the intimacy of coupling in ascending order is:

message w/o data packet. Currency is the only thing that can go wrong. The sender can't be directly hurt and the receiver is unlikely to be permanently hurt, but the receiver can spin wheels. [What the receiver does is out of synch in the solution, but that was already true as soon as the message was sent at the wrong time. IOW, the solution was out of synch as soon as the sender sent the message.]

message w/ by-value data. Add the timeliness and consistency of the data as something that can go wrong. The sender can't be directly hurt but the receiver may produce incorrect results. Still pretty benign and readily controllable.

message w/ data by reference. Add the fact that the receiver can modify the data values the sender uses without the sender knowing or expecting it (aka the Global Data Syndrome). This starts to get potentially nasty because the data is part of the sender's implementation and encapsulation has been broken.

message w/ behavior (e.g., Java applets). Now the applet can do unexpected things to the receiver. That is, the sender can modify the receiver in ways the receiver neither knows about nor expects. This is really nasty, as any web security guru can testify.

message w/ object references. The worst because the receiver can trash the sender's data and the behaviors can trash the receiver. In addition the receiver has access to /all/ of the object properties, not just the ones the sender expects to be accessed. A veritable Pandora's Box of unlimited side effects for both sender and receiver.
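
To make the by-value vs. by-reference step in that list concrete, here is a small Python sketch. The Sender class and careless_receiver function are invented for the example; the only thing it tries to show is the receiver changing data the sender still depends on.

```python
import copy


class Sender:
    def __init__(self) -> None:
        # Part of the sender's implementation.
        self.calibration = {"offset_mv": 1.0}


def careless_receiver(data: dict) -> None:
    # The receiver "normalizes" the data for its own purposes.
    data["offset_mv"] = 0.0


# By reference: the receiver's change leaks back into the sender.
by_ref = Sender()
careless_receiver(by_ref.calibration)

# By value: the receiver only ever touches its own copy.
by_val = Sender()
careless_receiver(copy.deepcopy(by_val.calibration))

print(by_ref.calibration["offset_mv"], by_val.calibration["offset_mv"])  # 0.0 1.0
```

The receiver code is identical in both cases. What differs is whether the sender handed over its own data or a copy, which is why the damage is so hard to spot from the receiver side.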

<aside on the human condition>
We don't seem to learn from past mistakes. Decades ago there was a language called Forth that essentially allowed the code to modify itself as it was executing. Like FORTRAN's assigned GOTO, that was one of those things that seemed like a good idea at the time but turned out to be a Really Bad Idea. By the late '80s even suggesting using Forth was grounds for summary dismissal in some shops. Yet within a few years that hard-won lesson was lost and people started passing Java applets around indiscriminately.

Similarly, back in the '50s and early '60s markup and scripting languages were very popular. But by the early '70s they were largely abandoned because of reliability and maintainability problems. But in the late '80s they were reborn with a vengeance because interoperability was needed and the Hardware Guys couldn't standardize things like endian.

So now we get to wonder (A) why we need a desktop today with more power than a 1984 mainframe to run a spreadsheet that ran just fine on a TRS-80 in 1984, (B) why the order entry forms on the web are always breaking, and (C) why some OO applications are much more unreliable than one would expect (i.e., those passing object references around).
</aside on the human condition>

* Encapsulation - (You added something, I don't care why, it broke my
stuff. Pull it out!)

To do that in a pure by-value message interface, one must change the interface definition. That will usually result in rude comments by the compiler if both sides aren't privy to it. So it is a problem but there are pretty good safeguards available in the development process.
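
Python has no compile step, so the sketch below uses a runtime check as a stand-in for the compiler's rude comments; in a statically typed language the mismatch would show up at build time instead. VoltageV1 and decode are hypothetical names, and the "range" field is just an example of something a sender might add on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoltageV1:
    """The message definition the receiver was built against."""
    millivolts: float


def decode(fields: dict) -> VoltageV1:
    # Unknown or missing fields raise TypeError instead of being
    # silently accepted.
    return VoltageV1(**fields)


ok = decode({"millivolts": 12.5})

try:
    # The sender added a field without renegotiating the interface.
    decode({"millivolts": 12.5, "range": "low"})
    rejected = False
except TypeError:
    rejected = True

print(ok, rejected)  # VoltageV1(millivolts=12.5) True
```

The addition fails loudly at the interface rather than quietly breaking something downstream, which is the safeguard being described.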

* Duplication - Why are we writing this data encoder / decoder stream
twice?

I'm afraid I don't follow. The encode is done once on the sender side and the decode is done once on the receiver side for each message.

If you mean that every message needs to be both encoded and decoded, then that is the price of decoupling so that each side can map the common message definition into its own semantics. The encode/decode is essentially a mapping function from the local subject matter context to the common message definition context. The encode/decode may be tedious but it usually isn't rocket science. Better to provide semantics mappings in a simple, mechanical way than in a complex, case-by-case way.
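
As a rough illustration of how mechanical that mapping can be, here is one encode and one decode for a single message, using JSON as the neutral format. The "VOLTAGE" ID and the field names are assumptions for the example, not part of any real interface.

```python
import json


def encode(millivolts: float) -> str:
    # Sender side: local value -> common message definition.
    return json.dumps({"id": "VOLTAGE", "millivolts": millivolts})


def decode(wire: str) -> float:
    # Receiver side: common message definition -> local value.
    fields = json.loads(wire)
    if fields["id"] != "VOLTAGE":
        raise ValueError("unexpected message id")
    return float(fields["millivolts"])


wire = encode(12.5)
channel_16_offset_mv = decode(wire)
print(wire, channel_16_offset_mv)
```

Each function is written once, on the side that owns it. Neither needs to know what the other side calls the value.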


*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
hsl@xxxxxxxxxxxxxxxxx
Pathfinder Solutions
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
"Model-Based Translation: The Next Step in Agile Development". Email
info@xxxxxxxxxxxxxxxxx for your copy.
Pathfinder is hiring: http://www.pathfindermda.com/about_us/careers_pos3.php.
(888)OOA-PATH


