Re: Singletons



Responding to Sasa...

What does it mean "forms normal relationship that always lead to that instance" and "modifies the context where instantiation occurs"?



All OO collaborations are done by navigating relationship paths among objects (i.e., the associations in a UML Class Model). There are four basic ways to implement a relationship: embedding an object in the implementation of another object; employing a referential pointer; passing an object reference as a message argument; and using an RDB-style search of instances by explicit identifier. The first is used when only the container needs access to the object. The second is the most common (e.g., it is the default for languages like Java). The third is used for temporary relationships. The last is very rarely used in OO applications.
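To make the four implementations concrete, here is a minimal sketch in Python. The classes (Engine, Car, Driver, Pump) and the cars_by_plate lookup table are hypothetical names chosen only for illustration; they are not from the examples discussed in this thread.

```python
class Engine:
    def start(self):
        return "engine started"


class Car:
    def __init__(self):
        # (1) Embedding: Car creates and owns its Engine; only the
        # container ever navigates to it.
        self._engine = Engine()

    def start(self):
        return self._engine.start()


class Pump:
    def dispense(self, car):
        return "fuel dispensed"


class Driver:
    def __init__(self, car):
        # (2) Referential attribute: Driver keeps a reference to a Car
        # that was instantiated elsewhere.
        self.car = car

    def refuel(self, pump):
        # (3) Message argument: the relationship to the Pump exists
        # only for the duration of this call.
        return pump.dispense(self.car)


# (4) Identifier-based search: find an instance by explicit key,
# RDB-style, instead of holding a reference to it.
cars_by_plate = {}

car = Car()
cars_by_plate["ABC-123"] = car
driver = Driver(car)

print(driver.car.start())
print(driver.refuel(Pump()))
print(cars_by_plate["ABC-123"] is car)
```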

Relationships are always implemented and navigated when addressing collaboration messages. Here the relationships that are relevant to solving the problem will naturally end up with all paths going to the same object instance because only one is created and the relationships are instantiated after that. This is very basic OOA/D. I would suggest getting any standard text on OOA/D for a detailed description.

By modifying the context I meant that one defines the solution flow of control differently so that the instantiation can be done in one place rather than in several (or within an iteration). Thus one moves the context within the solution for where one needs to instantiate the object. In one of the examples, that means moving the instantiation of ErrorFile from multiple subsystems to a single location in main().


And how would one in such example pass that instance to other interested participants (subfunctions, objects)?

The most common way to implement relationships is via a reference attribute. Such attributes are usually initialized by the constructor or a factory object when the object in hand is created. So one would pass the reference to the constructor/factory object.
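A rough sketch of that in Python, assuming the ErrorFile example: the file is instantiated once in main() and the reference is handed to each constructor. Parser and Network are hypothetical stand-ins for the interested participants.

```python
class ErrorFile:
    def __init__(self):
        self.lines = []

    def log(self, text):
        self.lines.append(text)


class Parser:
    def __init__(self, error_file):
        # The relationship to ErrorFile is instantiated here.
        self._errors = error_file

    def parse(self):
        self._errors.log("parser: bad token")


class Network:
    def __init__(self, error_file):
        self._errors = error_file

    def send(self):
        self._errors.log("network: timeout")


def main():
    error_file = ErrorFile()   # the only place an ErrorFile is created
    parser = Parser(error_file)
    network = Network(error_file)
    parser.parse()
    network.send()
    return error_file


print(main().lines)
```

Both participants end up navigating to the same instance simply because only one was created before the relationships were set up.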

What if in the future for some reason you actually want to change that, and have different contexts trace to different files? Wouldn't it be fairly difficult to refactor the singleton based solution?

Probably not. Singleton is really just a specialized factory object. The clients effectively ask it to create an object and it returns the singleton reference _as if_ it had been newly created. So if you need more instances you substitute a different factory object.

Singleton to me is not a factory. It is the one and only instance of the object. The GetInstance() returns reference to a single instance of the object. It may create it on the first access in case of lazy initialization, but its purpose is to retrieve it.

When the client is invoking GetInstance, what is it doing? It is creating an instance from its perspective. The fact that Singleton only actually creates an instance once is not relevant. It is still acting as a special kind of factory object.


Suppose the object being created is a subclass and each sibling subclass should only be instantiated once. One could modify Singleton trivially to do that where a parameter tells it which subclass to create. It creates one if there isn't any; otherwise it just returns the existing instance. Now Singleton is actually creating multiple objects, just like a factory object. Thus there is no substantive difference between Singleton and a factory object from the client's perspective.
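A minimal sketch of that modification in Python. Shape, Circle, Square, and ShapeSingleton are hypothetical names, and passing the class itself as the parameter is just one assumed way of telling it which subclass to create.

```python
class Shape:
    pass


class Circle(Shape):
    pass


class Square(Shape):
    pass


class ShapeSingleton:
    """Hands out at most one instance per sibling subclass."""

    def __init__(self):
        self._instances = {}

    def get_instance(self, kind):
        # Create one if there isn't any; otherwise return the existing one.
        if kind not in self._instances:
            self._instances[kind] = kind()
        return self._instances[kind]


shapes = ShapeSingleton()
print(shapes.get_instance(Circle) is shapes.get_instance(Circle))
print(shapes.get_instance(Circle) is shapes.get_instance(Square))
```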

IOW, the only difference between Singleton and a factory object is in the actual number of instances it creates _in its implementation_. There is no way for the client to know whether the instance returned is one of one or one of many. Think of it this way: for the first invocation of Singleton or a factory object, what difference is there?


Only in the lazy load variant. Singleton could be unconditionally instantiated as a global, and simply retrieved via GetInstance. Always.

But that's not the way Singleton works. There are actually two objects involved in Singleton. GetInstance is a method of the Singleton object, which is not the object that the client actually collaborates with through the reference. To access that method there must already be a relationship that can be navigated by the client to get to the Singleton object to invoke GetInstance. (Some poorly formed OOPLs like C++ allow one to create global references directly for cases like this but most don't.)

The Singleton.GetInstance method then returns a reference to the object that the client actually collaborates with. The Singleton instantiates that collaboration object as needed. The reference that the Singleton returns instantiates a temporary relationship to the collaboration object. So what one has is:

               0..*
[Client] ----------------------------+
    | *                              |
    |                                |
    | R1                             |
    |                                |
    | gets reference from            |
    | 1                              |
[Singleton]                          | R3
    | 1                              |
    |                                |
    | R2                             |
    |                                |
    | creates                        |
    | 0..1          0..1   accesses  |
[Collaborator] ----------------------+

When Singleton passes back the reference to Client it is temporarily instantiating the R3 relationship between Client and the Collaborator that Client navigates when it talks to Collaborator (i.e., allows Client to address messages to the right Collaborator).

Bottom line: when Client accesses Singleton.GetInstance, it does that via exactly the same sort of relationship navigation that it would employ to access any factory object. IOW, there must be a relationship path to Singleton just as there must be a relationship path to a factory object.
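The two-object structure can be sketched in Python roughly as follows. The class names follow the diagram; do_work and run are hypothetical methods, and get_instance stands in for GetInstance.

```python
class Collaborator:
    def do_work(self):
        return "done"


class Singleton:
    def __init__(self):
        self._collaborator = None   # R2: 0..1 until someone asks

    def get_instance(self):
        # Instantiate the collaboration object only as needed.
        if self._collaborator is None:
            self._collaborator = Collaborator()
        return self._collaborator


class Client:
    def __init__(self, singleton):
        # R1: set up when the Client is created, exactly as it would
        # be for any factory object.
        self._singleton = singleton

    def run(self):
        # R3: a temporary relationship, valid for this collaboration.
        collaborator = self._singleton.get_instance()
        return collaborator.do_work()


singleton = Singleton()
first, second = Client(singleton), Client(singleton)
print(first.run(), second.run())
```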

Note that most factory objects are only instantiated once, typically during subsystem initialization. When one instantiates instances of the clients that invoke the factory objects, one also instantiates a relationship between them and the factory object (usually a pointer reference). One does exactly the same thing with a Singleton.

Also note that the GoF themselves tacitly view Singleton as a factory object because they group it with the other factory patterns in the Creational Patterns category.

So while I can agree that substituting with different factory might work with tweaking, I don't find it very elegant.



I think it is quite elegant from the perspective of maintainability. The change is entirely encapsulated and completely transparent to any existing clients. All one is doing is changing the /implementation/ of the Singleton object.


Consider the following: you have a bunch of code relying on the singleton. Now, due to new requirements, you realize that newly written code must use a different (but only one) instance. It basically means you have two sorts of code. How will you replace the singleton with a factory? How will the factory know which instance to return?

I don't think any code changes outside of the implementation of Singleton itself. The client invokes Singleton.GetInstance and gets a reference back that it collaborates with. If you change the implementation of that Singleton object so it becomes a simple factory object that creates a new instance each time, it isn't a Singleton pattern anymore. Instead it's something like Concrete Creator in the Factory Method pattern. However, that is completely transparent to the client because the Client still invokes the same GetInstance method, it still gets the right instance of the collaboration object, and it still collaborates with that object in the same ways. IOW, the Client is completely unaffected by the change.
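As a hedged illustration in Python, here are two interchangeable implementations behind the same get_instance call. TraceFile, SingleInstanceProvider, NewInstanceProvider, and trace_target are hypothetical names; the only point is that Client is identical in both cases.

```python
class TraceFile:
    pass


class SingleInstanceProvider:
    """Singleton behaviour: always the same TraceFile."""

    def __init__(self):
        self._instance = None

    def get_instance(self):
        if self._instance is None:
            self._instance = TraceFile()
        return self._instance


class NewInstanceProvider:
    """Plain factory behaviour: a fresh TraceFile on every request."""

    def get_instance(self):
        return TraceFile()


class Client:
    def __init__(self, provider):
        self._provider = provider

    def trace_target(self):
        # Same call either way; the Client cannot tell the difference.
        return self._provider.get_instance()


old_style = Client(SingleInstanceProvider())
new_style = Client(NewInstanceProvider())
print(old_style.trace_target() is old_style.trace_target())
print(new_style.trace_target() is new_style.trace_target())
```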


That problem can actually be solved without Singleton but there are other contexts that are more difficult. For example, in a networking situation one may need to send multiple message packets for a single message. Typically one wants to open/close one channel around the message and use it for each message packet. If the most common mode is for single packet messages, one might try to open a channel at the packet level. Singleton then provides a performance optimization by simply returning the existing instance handle when there is more than one packet.

I don't understand - why wouldn't one always open the channel at the message level?

Because many network protocols limit the size of the message so it has to be split up into separate packets. Those packets will be sent rapidly so one can avoid the overhead of channel allocation if one sends them all over the same channel. Also, depending on the configuration, using the same channel may make it easier to keep track of what packets need to be reassembled into a single message on the other end.

I still don't understand, why can't you open the channel at the message level. The packets are parts of message, hence all packets will use the same channel.



That is exactly my point. One does not want to open/close channels around packets. One wants to do it at the level of messages. But breaking up a message into packets may be done in a different place (maybe even a different subsystem) that doesn't understand channels. So one has to deal with channels in the context where they are relevant, which is dispatching packets to the network.


So why not form MessageContext class or something which encapsulates Packet?

Or - why not do following:
a) packets[] = BreakMessage()
b) channel = CreateChannel()
c) for each packet in packets
       SendPacket(packet, channel)
?

My point is that the context for doing (a) may be in one part of the application while the context for doing (b) and (c) may be in another part of the application, possibly a different subsystem where CreateChannel and SendPacket process one packet at a time.

That sort of separation of concerns is not uncommon. BreakMessage may need to understand the content of a Message to break it up properly. That is at a higher level of abstraction than the rules and policies of shipping packets off to the network. In the latter context one does not need to know anything about the content of the packet but one needs to know a lot about channels. To make the application robust one needs to decouple those concerns as much as possible.
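Here is one way that decoupling might look, sketched in Python under some assumptions: the network side sees one packet at a time, and a hypothetical "last" flag tells it when the message is complete. Channel, ChannelProvider, PacketDispatcher, and break_message are illustrative names, not a real networking API.

```python
class Channel:
    opened = 0   # counts channel allocations, for illustration only

    def __init__(self):
        Channel.opened += 1
        self.sent = []

    def send(self, packet):
        self.sent.append(packet)


class ChannelProvider:
    """Singleton-style accessor: reuses the open channel if there is one."""

    def __init__(self):
        self._channel = None

    def get_instance(self):
        if self._channel is None:
            self._channel = Channel()
        return self._channel

    def release(self):
        self._channel = None


class PacketDispatcher:
    """Network side: understands channels, not message content."""

    def __init__(self, provider):
        self._provider = provider

    def send_packet(self, packet, last):
        self._provider.get_instance().send(packet)
        if last:
            self._provider.release()


def break_message(message, size):
    """Message side: understands content, knows nothing of channels."""
    return [message[i:i + size] for i in range(0, len(message), size)]


dispatcher = PacketDispatcher(ChannelProvider())
packets = break_message("hello world", 4)
for index, packet in enumerate(packets):
    dispatcher.send_packet(packet, last=(index == len(packets) - 1))

print(len(packets), "packets sent over", Channel.opened, "channel")
```

The dispatcher asks for a channel on every packet, yet only one channel is allocated per message.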


There is an analogy to database transactions. In the UI of the application that uses the DB, there is no need to know anything about DB transactions. However, the developer always finds some mapping that is relevant (e.g., the user hitting Save in a particular window triggers the close of an open DB transaction). The UI subsystem simply announces that the Save button has been hit. In the subsystem that understands RDBs that announcement is mapped into a closeTransaction action of some sort.


I cannot relate to that analogy. Sorry. IMHO Save button can only relate to Open/Close pair. Not to close.

I'm afraid I don't understand your point. DB transactions span multiple DB update queries. The user is supplying data to construct those queries through a sequence of discrete activities in the UI that have a beginning and end. In the UI there will have to be some unique user action that signals the beginning of such a sequence that maps to opening a DB transaction. That is, there /must/ be separate actions in the UI that define the beginning and end of the sequence that correspond to opening and closing DB transactions.
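A small Python sketch of that mapping, assuming hypothetical event names ("edit_started", "save_pressed") and a DbSubsystem that only records what it would do. The UI announces what happened; only the DB side knows about transactions.

```python
class DbSubsystem:
    def __init__(self):
        self.log = []
        self._in_transaction = False

    def on_ui_event(self, event):
        # The mapping from UI announcements to transaction actions
        # lives entirely on the DB side.
        if event == "edit_started" and not self._in_transaction:
            self._in_transaction = True
            self.log.append("open transaction")
        elif event == "save_pressed" and self._in_transaction:
            self._in_transaction = False
            self.log.append("close transaction")


class Ui:
    def __init__(self, announce):
        self._announce = announce   # callback; the UI knows nothing else

    def start_editing(self):
        self._announce("edit_started")

    def press_save(self):
        self._announce("save_pressed")


db = DbSubsystem()
ui = Ui(db.on_ui_event)
ui.start_editing()
ui.press_save()
print(db.log)
```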

Yes. Strategy should be stateless in most situations. It is a delegation of behaviors. It gets its data from the Context object.

What if Strategy subclass uses member variables to keep its internal state which is irrelevant to the outer world (some accumulating data etc)?

Then you would need to have multiple instantiations. My point is just that that is rather unusual. The much more common situation is something like an Employee object where one needs different strategies for computing the benefits in a payroll system. Each strategy will eat exactly the same data, which is all contained in Employee.
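For the Employee case, a minimal Python sketch might look like this. The benefit rules, the attribute names, and the STANDARD and SENIOR instances are invented for illustration; the point is only that each strategy holds no state of its own, so one instance can serve every Employee.

```python
class StandardBenefits:
    def compute(self, employee):
        # All data comes from the Context (Employee), none from self.
        return employee.salary // 20


class SeniorBenefits:
    def compute(self, employee):
        return employee.salary // 20 + 100 * employee.years


# One instance of each strategy, created once and shared.
STANDARD = StandardBenefits()
SENIOR = SeniorBenefits()


class Employee:
    def __init__(self, salary, years, benefit_strategy):
        self.salary = salary
        self.years = years
        self._benefit_strategy = benefit_strategy

    def benefits(self):
        # Clients call this; they never see the strategy object.
        return self._benefit_strategy.compute(self)


staff = [
    Employee(40000, 2, STANDARD),
    Employee(40000, 10, SENIOR),
    Employee(60000, 1, STANDARD),
]
print([e.benefits() for e in staff])
```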


Consider the following:
You make a framework which relies on stateless strategies. You publish it. How will you force users to avoid member variables (other than through documentation)?

I'm afraid I don't understand where you are going with this. One utilizes the Strategy pattern to resolve very specific dynamic relationship problems. So the implementation of a specific application of the Strategy pattern will be highly tailored to a very specific delegation context. So I don't follow where frameworks and other large scale infrastructures enter the picture.


My point generalized - what each strategy will do is really not known. The only thing one can know about the strategy is that it will implement the interface.

The interface for a particular Strategy pattern implementation is defined in the <polymorphic> interface of the Strategy object.

In many situations instantiating a [Strategy] object every time it is needed would be inefficient because of the overhead of heap operations.

And in many situations it is no issue at all.
I do think, though, that always instantiating it is the simpler approach of the two, and I would devise some way of resolving performance issues only when they arise.

You don't do a lot of R-T/E, do you? B-)) In that world counting cycles is often critical.


No I don't :-(
But I do speak from my (very limited) experience.

However, from an aesthetic viewpoint I would argue that professionalism requires at least evaluating the alternatives. If it is too much


Agreed.

trouble for the benefit, then fine. In the case of Strategy (when one doesn't /need/ multiple instances), though, I would bet that in most


Refer to the limitation of each strategy keeping its internal state through member variables.

IME the vast majority of situations where one applies the Strategy pattern will naturally have all state variables in the Context object.

cases the total number of executable statements will be the same or fewer if one does only one instance. The performance gain comes from /where/ the statements are, not changing them. So there is no justification for not doing it right to improve performance (however small the gain may be).


It imposes some limitations on the clients (concrete strategies), which cannot be enforced by the client logic (strategy client), and it makes the provider of the strategy implementation somewhat more complicated, so to me the question really amounts to: is it justified?

How? The way the GoF implement the pattern's delegation, the Client doesn't need to know that the Strategy objects even exist. That is a private matter between Context and Strategy. The DbC contract between the Client and the Context is exactly the same as it would have been without delegation.


<aside>
We have a generation of developers today that has lived with Moore's Law so long that they believe all performance problems are solved by getting a bigger and better computer. As a result they don't even know how to


I won't say I'm one of those, but I do get slightly irritated (sorry for using harsh words) when people are using statements like "We might have performance problems there" (which then leads to devising complex schemes of avoiding virtual performance problems). IME these statements always came down to "we never had performance issues there" - not because of the premature optimizations, but simply because performance issue was never there. Whatever happened to that "cut once measure twice" thing?

think about performance issues. Even Moore doesn't expect that Law to hold forever and when it hits an inflection point there is going to be a lot of gnashing of teeth as developers re-learn the hard-won lessons of the past.

In fact, performance optimizations at the 3GL level are mostly a matter of proper style and mindset. IOW, they are almost idiomatic. Usually it doesn't take significantly more time and effort to optimize.


IMHO almost every optimization will complicate things if only a little bit. The time required in initial development may not be significant, but keep in mind that developers revisit that code and have to deal with greater complexity (however small).

If the tactical optimizations don't require any more code, as in the Strategy case, then why would there be problems subsequently? I would argue that by dealing with the instantiation properly one would actually make it easier for maintenance in the future because the instantiation will be better encapsulated.

Note that one reason we use Singleton and the rest of the creation patterns is because we want to isolate and decouple instantiation issues from collaboration issues. The rules and policies for participation in collaborations are quite often different than the rules and policies governing the collaboration itself.

It is only at the level of OOD where one makes strategic decisions (e.g., employing a write cache) where improving performance can lead to substantially more complex code. Then one has a valid trade-off vs. performance. And even then one can often implement so that the solution is reusable.

Rather than running out and buying a new computer every year, people should start to ask why the VisiCalc spreadsheet that ran on a TRS-80 with 100 KB of memory, floppy disks, and a clock rate measured in kilocycles can't possibly work today without megabytes of memory, gigabytes of hard disk, and a GHz machine. Most of the problem is feature bloat, exotic UIs, and OS bloat, but a significant part is still due to the failure of developers to think about performance, much less write high performance code routinely.
</aside>


IMHO, given modern computers and prices, one should define the minimal configuration and acceptable performance expectations, monitor performance constantly (ideally via automated tests) and correct the issues where performance is not satisfactory.

That approach does not work well in practice IME. One problem is that performance problems often don't show up except under heavy load situations. Those are difficult to setup and tend to require a lot of time to execute, which precludes continuous performance monitoring. For example, proper load testing sometimes requires throwing randomized inputs at the software for tens or even hundreds.

A more insidious problem is that failing to tactically write high performance code routinely is a cumulative problem that spans the entire code base. By the time one realizes that a performance problem is imminent, one is faced with a massive refactoring of all the code.

The sorts of strategic design decisions that can create performance problems usually require substantial structural rewrites after the fact; essentially one has to completely rewrite the relevant code. So one is usually much better off doing some up-front analysis and then relying on prototyping to resolve any worrisome situations. IOW, one measures twice so that one only needs to cut once. B-)

The question still remains - when does one need Singleton?



I think one needs it when:

(A) there must be only one instance AND

(B) there is no feasible way in the solution to reduce the number of opportunities to instantiate an instance to 1. Specifically, there is no reasonable way to avoid:

(B1) Instantiating the object in different parts of the solution and/or

(B2) Instantiating the object within an iteration of some kind.


See above.

I'm not sure what you are referring to.


*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
hsl@xxxxxxxxxxxxxxxxx
Pathfinder Solutions
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
"Model-Based Translation: The Next Step in Agile Development". Email
info@xxxxxxxxxxxxxxxxx for your copy.
Pathfinder is hiring: http://www.pathfindermda.com/about_us/careers_pos3.php.
(888)OOA-PATH


