Re: Singletons



H. S. Lahman wrote:
Responding to Sasa...

What does it mean "forms normal relationship that always lead to that instance" and "modifies the context where instantiation occurs"?




All OO collaborations are done by navigating relationship paths among objects (i.e., the associations in a UML Class Model). There are four basic ways to implement a relationship: embedding an object in the implementation of another object; employing a referential pointer; passing an object reference as a message argument; and using an RDB-style search of instances by explicit identifier. The first is used when only the container needs access to the object. The second is the most common (e.g., it is the default for languages like Java). The third is used for temporary relationships. The last is very rarely used in OO applications.
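As a rough illustration of the four implementation styles listed above, here is a minimal Python sketch. The names Engine, Car, Driver, Mechanic, CAR_REGISTRY and find_car are hypothetical and were chosen only for this example.

```python
class Engine:
    def start(self):
        return "engine started"


class Car:
    """1. Embedding: Car owns its Engine; nobody else needs access to it."""

    def __init__(self, plate):
        self.plate = plate
        self._engine = Engine()  # created and held inside the container

    def start(self):
        return self._engine.start()


class Driver:
    """2. Referential attribute: Driver holds a reference to an existing Car."""

    def __init__(self, car):
        self.car = car  # the relationship is instantiated here

    def drive(self):
        return self.car.start()


class Mechanic:
    """3. Message argument: the relationship lasts only for the call."""

    def inspect(self, car):
        return f"inspected {car.plate}"


# 4. Search by explicit identifier, RDB style (rare in OO applications).
CAR_REGISTRY = {}


def find_car(plate):
    return CAR_REGISTRY[plate]


car = Car("ABC-123")
CAR_REGISTRY[car.plate] = car
driver = Driver(car)
```

In all four cases the client ends up addressing a message to a specific Car; only the way the path to that Car is stored differs.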

Relationships are always implemented and navigated when addressing collaboration messages. Here the relationships that are relevant to solving the problem will naturally end up with all paths going to the same object instance because only one is created and the relationships are instantiated after that. This is very basic OOA/D. I would suggest getting any standard text on OOA/D for a detailed description.

By modifying the context I meant that one defines the solution flow of control differently so that the instantiation can be done in one place rather than in several (or within an iteration). Thus one moves the context within the solution for where one needs to instantiate the object. In one of the examples, that means moving the instantiation of ErrorFile from multiple subsystems to a single location in main().



And how would one in such example pass that instance to other interested participants (subfunctions, objects)?


The most common way to implement relationships is via a reference attribute. Such attributes are usually initialized by the constructor or a factory object when the object in hand is created. So one would pass the reference to the constructor/factory object.
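A minimal sketch of that approach, assuming the ErrorFile example mentioned earlier: the object is created once in main() and the reference is handed to each interested object through its constructor. Parser and Exporter are hypothetical stand-ins for the subsystems.

```python
class ErrorFile:
    def __init__(self):
        self.entries = []

    def log(self, message):
        self.entries.append(message)


class Parser:
    def __init__(self, error_file):
        self._errors = error_file  # reference attribute set at construction

    def parse(self, text):
        if not text:
            self._errors.log("parser: empty input")


class Exporter:
    def __init__(self, error_file):
        self._errors = error_file

    def export(self, rows):
        if not rows:
            self._errors.log("exporter: nothing to export")


def main():
    error_file = ErrorFile()  # the only place an ErrorFile is created
    parser = Parser(error_file)
    exporter = Exporter(error_file)
    parser.parse("")
    exporter.export([])
    return error_file
```

Both objects write to the same ErrorFile without either of them knowing how or where it was created.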

You mean - via parameters?


Only in the lazy load variant. A Singleton could be unconditionally instantiated as a global and simply retrieved via GetInstance. Always.
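The two variants mentioned here could look roughly like this in Python. EagerLog and LazyLog are hypothetical names, and a class-level attribute stands in for the static member a GoF-style implementation would use.

```python
class EagerLog:
    """Instance is created unconditionally, right after the class is defined."""

    _instance = None

    @classmethod
    def get_instance(cls):
        return cls._instance


EagerLog._instance = EagerLog()


class LazyLog:
    """Instance is created on the first call to get_instance()."""

    _instance = None

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance
```

In both variants callers reach the instance through the class itself, without holding a reference to anything beforehand.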


But that's not the way Singleton works. There are actually two objects involved in Singleton. GetInstance is a method of the Singleton object, which is not the object that the client actually collaborates with through the reference. To access that method there must already be a relationship that can be navigated by the client to get to the Singleton object to invoke GetInstance. (Some poorly formed OOPLs like C++ allow one to create global references directly for cases like this but most don't.)

The Singleton.GetInstance method then returns a reference to the object that the client actually collaborates with. The Singleton instantiates that collaboration object as needed. The reference that the Singleton returns instantiates a temporary relationship to the collaboration object. So what one has is:

                 0..*
[Client] ----------------------------+
    | *                              |
    |                                |
    | R1                             |
    |                                |
    | gets reference from            |
    | 1                              |
[Singleton]                          | R3
    | 1                              |
    |                                |
    | R2                             |
    |                                |
    | creates                        |
    | 0..1           0..1  accesses  |
[Collaborator] ----------------------+

When Singleton passes back the reference to Client it is temporarily instantiating the R3 relationship between Client and the Collaborator that Client navigates when it talks to Collaborator (i.e., allows Client to address messages to the right Collaborator).
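One possible reading of the diagram, sketched in Python. This is only an interpretation of the three relationships R1, R2 and R3, with the Singleton modeled as an ordinary object that clients hold a reference to; it is not the GoF static-method form.

```python
class Collaborator:
    def handle(self, request):
        return f"handled {request}"


class Singleton:
    """Creates at most one Collaborator (R2) and hands out references to it."""

    def __init__(self):
        self._collaborator = None

    def get_instance(self):
        if self._collaborator is None:
            self._collaborator = Collaborator()
        return self._collaborator


class Client:
    def __init__(self, singleton):
        self._singleton = singleton  # R1: path from Client to Singleton

    def do_work(self, request):
        # R3: temporary relationship, held only for the duration of this call
        collaborator = self._singleton.get_instance()
        return collaborator.handle(request)


provider = Singleton()
first, second = Client(provider), Client(provider)
```

Both clients end up talking to the same Collaborator, but each one gets there by navigating R1 first.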

In GoF there is only one class (Singleton) and it has only one instance (object). If you are referring to one class keeping only one instance of the other class - then it is not (GoF) Singleton.

Bottom line: when Client accesses Singleton.GetInstance, it does that via exactly the same sort of relationship navigation that it would employ to access any factory object. IOW, there must be a relationship path to Singleton just as there must be a relationship path to a factory object.

It does that via a static method. That's not actually relationship navigation. It is invoking a global method. This is what bothers me about Singleton.
Is it possible that you consider singleton to be something else?

Note that most factory objects are only instantiated once, typically during subsystem initialization. When one instantiates instances of the clients that invoke the factory objects, one also instantiates a relationship between them and the factory object (usually a pointer reference). One does exactly the same thing with a Singleton.

Also note that the GoF themselves tacitly view Singleton as a factory object because they group it with the other factory patterns in the Creational Patterns category.

In GoF, the responsibility section states the following for the Singleton:
"may be responsible for creating its own unique instance."
Notice the word may (not is).

Consider also the end of the 2nd point in implementation paragraph:
"No longer is the Singleton class responsible for creating the singleton. Instead, its primary responsibility is to make the singleton object of choice accessible in the system. "

So while I can agree that substituting with different factory might work with tweaking, I don't find it very elegant.




I think it is quite elegant from the perspective of maintainability. The change is entirely encapsulated and completely transparent to any existing clients. All one is doing is changing the /implementation/ of the Singleton object.



Consider the following: you have a bunch of code relying on the singleton. Now, due to new requirements, you realize that newly written code must use a different (but only one) instance. It basically means you have two sorts of code. How will you replace the singleton with a factory? How will the factory know which instance to return?


I don't think any code changes outside of the implementation of Singleton itself. The client invokes Singleton.GetInstance and gets a reference back that it collaborates with. If you change the implementation of that Singleton object so it becomes a simple factory object that creates a new instance each time, it isn't a Singleton pattern anymore. Instead it's something like Concrete Creator in the Factory Method pattern. However, that is completely transparent to the client because the Client still invokes the same GetInstance method, it still gets the right instance of the collaboration object, and it still collaborates with that object in the same ways. IOW, the Client is completely unaffected by the change.
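A small sketch of the substitution described above, assuming the client only ever calls a get_instance method on whatever provider it was given. SingleInstanceProvider, NewInstanceProvider and client_code are hypothetical names.

```python
class Collaborator:
    pass


class SingleInstanceProvider:
    """Singleton-like behavior: always returns the same Collaborator."""

    def __init__(self):
        self._instance = None

    def get_instance(self):
        if self._instance is None:
            self._instance = Collaborator()
        return self._instance


class NewInstanceProvider:
    """Same interface, but behaves like a plain factory."""

    def get_instance(self):
        return Collaborator()


def client_code(provider):
    # The client only knows about get_instance(); it cannot tell the two apart.
    return provider.get_instance()
```

The client function is identical in both cases; only the provider's implementation decides how many Collaborator objects exist.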

This is the trivial case where you modify GetInstance to always create a new instance. My question is: what if you need to move from one instance to two instances? What if you have a bunch of code invoking GetInstance() and now need to separate that code to use two instances (and no more)? How will you refactor that?

That problem can actually be solved without Singleton but there are other contexts that are more difficult. For example, in a networking situation one may need to send multiple message packets for a single message. Typically one wants to open/close one channel around the message and use it for each message packet. If the most common mode is for single packet messages, one might try to open a channel at the packet level. Singleton then provides a performance optimization by simply returning the existing instance handle when there is more than one packet.
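A rough sketch of the networking scenario, under the assumption that the packet-level code is told when a packet is the last one of its message (the is_last flag is invented for this example, as are Channel, ChannelProvider and send_packet).

```python
class Channel:
    opened = 0  # counts how many channels were allocated

    def __init__(self):
        Channel.opened += 1
        self.sent = []

    def send(self, packet):
        self.sent.append(packet)


class ChannelProvider:
    """Hands out the channel that is currently open, allocating it on demand."""

    def __init__(self):
        self._channel = None

    def get_instance(self):
        if self._channel is None:
            self._channel = Channel()
        return self._channel

    def release(self):
        self._channel = None  # called once the last packet of a message is out


def send_packet(provider, packet, is_last):
    # Packet-level code: knows about channels, not about whole messages.
    channel = provider.get_instance()
    channel.send(packet)
    if is_last:
        provider.release()
    return channel
```

A two-packet message allocates one channel; a following single-packet message allocates a fresh one.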


I don't understand - why wouldn't one always open the channel at the message level?


Because many network protocols limit the size of the message so it has to be split up into separate packets. Those packets will be sent rapidly so one can avoid the overhead of channel allocation if one sends them all over the same channel. Also, depending on the configuration, using the same channel may make it easier to keep track of what packets need to be reassembled into a single message on the other end.


I still don't understand why you can't open the channel at the message level. The packets are parts of the message, hence all packets will use the same channel.




That is exactly my point. One does not want to open/close channels around packets. One wants to do it at the level of messages. But breaking up a message into packets may be done in a different place (maybe even a different subsystem) that doesn't understand channels. So one has to deal with channels in the context where they are relevant, which is dispatching packets to the network.



So why not form MessageContext class or something which encapsulates Packet?

Or - why not do following:
a) packets[] = BreakMessage()
b) channel = CreateChannel()
c) for each packet in packets
       SendPacket(packet, channel)
?
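For reference, the three steps above could be written as the following runnable Python sketch. The fixed packet size and the in-memory Channel class are assumptions made only so the example runs.

```python
def break_message(message, size):
    """Split a message into packets of at most `size` bytes."""
    return [message[i:i + size] for i in range(0, len(message), size)]


class Channel:
    def __init__(self):
        self.sent = []
        self.closed = False

    def close(self):
        self.closed = True


def send_packet(packet, channel):
    channel.sent.append(packet)


def send_message(message, size=4):
    packets = break_message(message, size)  # (a)
    channel = Channel()                     # (b)
    try:
        for packet in packets:              # (c)
            send_packet(packet, channel)
    finally:
        channel.close()
    return channel
```

This works when all three steps live in the same place; it says nothing about the case where (a) and (b)/(c) belong to different subsystems.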


My point is that the context for doing (a) may be in one part of the application while the context for doing (b) and (c) may be in another part of the application, possibly a different subsystem where CreateChannel and SendPacket process one packet at a time.

I can agree with that if we are talking either about a legacy system which is hard to refactor or about third-party systems whose code we have no access to.

That sort of separation of concerns is not uncommon. BreakMessage may need to understand the content of a Message to break it up properly. That is at a higher level of abstraction than the rules and policies of shipping packets off to the network. In the latter context one does not need to know anything about the content of the packet but one needs to know a lot about channels. To make the application robust one needs to decouple those concerns as much as possible.

The sending logic needs to know about the content and about the channels (since it sends bits via channel(s)). It might not need to know about the packets, if we treat them as higher level abstractions, still those packets get to be transferred to raw bits.

Question: how do you think responsibilities should be divided? What design leads to justification of singleton?

There is an analogy to database transactions. In the UI of the application that uses the DB, there is no need to know anything about DB transactions. However, the developer always finds some mapping that is relevant (e.g., the user hitting Save in a particular window triggers the close of an open DB transaction). The UI subsystem simply announces that the Save button has been hit. In the subsystem that understands RDBs that announcement is mapped into a closeTransaction action of some sort.



I cannot relate to that analogy. Sorry. IMHO Save button can only relate to Open/Close pair. Not to close.


I'm afraid I don't understand your point. DB transactions span multiple DB update queries. The user is supplying data to construct those queries through a sequence of discrete activities in the UI that have a beginning and end. In the UI there will have to be some unique user action that signals the beginning of such a sequence that maps to opening a DB transaction. That is, there /must/ be separate actions in the UI that define the beginning and end of the sequence that correspond to opening and closing DB transactions.

AFAIK, it is a bad idea for one UI element to trigger the opening of the transaction while another triggers its closing (if we are talking about a DB transaction). The DB statements inside the transaction lock rows. What if the user clicks New (which opens the transaction), then issues some queries (which lock the data) and then goes on a coffee break before she clicks Save? The touched rows are locked to everyone else.

I find it better to have single UI element both opening and closing transaction.
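A minimal sketch of that idea, using Python's standard sqlite3 module with an in-memory database. The EditForm class, the employee table and the on_save function are hypothetical; the point is only that edits are buffered on the UI side and the transaction is opened and committed inside the single Save handler.

```python
import sqlite3


class EditForm:
    """UI side: collects edits in memory and holds no database locks."""

    def __init__(self):
        self.pending = []

    def edit(self, name, salary):
        self.pending.append((name, salary))


def on_save(conn, form):
    # Persistence side: the transaction starts and ends within this handler.
    # `with conn` commits on success and rolls back if an exception escapes.
    with conn:
        conn.executemany(
            "INSERT INTO employee (name, salary) VALUES (?, ?)", form.pending
        )
    form.pending = []


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
form = EditForm()
form.edit("Ann", 100)
form.edit("Bob", 120)
on_save(conn, form)
```

No rows are touched while the user is still editing, so a coffee break before Save locks nothing.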

Yes. Strategy should be stateless in most situations. It is a delegation of behaviors. It gets its data from the Context object.


What if a Strategy subclass uses member variables to keep internal state which is irrelevant to the outer world (some accumulated data, etc.)?


Then you would need to have multiple instantiations. My point is just that that is rather unusual. The much more common situation is something like an Employee object where one needs different strategies for computing the benefits in a payroll system. Each strategy will eat exactly the same data, which is all contained in Employee.
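A small sketch of the payroll case, assuming two made-up benefit rules (FlatBenefits and SeniorityBenefits). Each strategy reads everything it needs from the Employee it is given, so one instance of each can be shared.

```python
class Employee:
    def __init__(self, salary, years, strategy):
        self.salary = salary
        self.years = years
        self._strategy = strategy  # shared, stateless strategy object

    def benefits(self):
        return self._strategy.compute(self)


class FlatBenefits:
    def compute(self, employee):
        return 500


class SeniorityBenefits:
    def compute(self, employee):
        return 100 * employee.years


# One instance of each strategy serves every Employee.
FLAT = FlatBenefits()
SENIORITY = SeniorityBenefits()

staff = [
    Employee(3000, 2, FLAT),
    Employee(4000, 7, SENIORITY),
    Employee(3500, 1, FLAT),
]
```

Because the strategies have no fields of their own, sharing them cannot leak data from one Employee to another.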



Consider the following:
You make a framework which relies on stateless strategies. You publish it. How will you force the users to avoid member variables (other than through documentation)?


I'm afraid I don't understand where you are going with this. One utilizes the Strategy pattern to resolve very specific dynamic relationship problems. So the implementation of a specific application of the Strategy pattern will be highly tailored to a very specific delegation context. So I don't follow where frameworks and other large scale infrastructures enter the picture.

My point is: you create some library. You publish it in binary form. You have some logic using the Strategy pattern and enable the clients to register their own concrete strategies.
What if your code relies on a single instance of each strategy?
What if some client writes a concrete strategy which relies on internal member variables?

The stuff doesn't work correctly because the client of your code used member variables.
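The failure mode can be reproduced with a few lines of Python. RunningTotalStrategy is a hypothetical client-written strategy that keeps a member variable; Context stands for library code that assumes one shared instance is safe.

```python
class RunningTotalStrategy:
    """A client-written strategy that quietly keeps internal state."""

    def __init__(self):
        self._total = 0

    def compute(self, amount):
        self._total += amount
        return self._total


class Context:
    def __init__(self, strategy):
        self._strategy = strategy

    def process(self, amount):
        return self._strategy.compute(amount)


shared = RunningTotalStrategy()  # the library caches one instance
a, b = Context(shared), Context(shared)
a_result = a.process(10)
b_result = b.process(5)  # sees the state accumulated through `a`

separate = Context(RunningTotalStrategy()).process(5)
```

With the shared instance the second context gets 15 instead of 5; with its own instance it gets 5.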

My point generalized - what each strategy will do is really not known. The only thing one can know about the strategy is that it will implement the interface.


The interface for a particular Strategy pattern implementation is defined in the <polymorphic> interface of the Strategy object.

But the implementation is not - hence you cannot predict what the developer of the concrete strategy will do. If the moon is full she might even use member variables :-D

In many situations instantiating a [Strategy] object every time it is needed would be inefficient because of the overhead of heap operations.


And in many situations it is no issue at all.
I do think, though, that always instantiating it is the simpler approach of the two, and I would devise some way of resolving performance issues only when they arise.


You don't do a lot of R-T/E, do you? B-)) In that world counting cycles is often critical.



No I don't :-(
But I do speak from my (very limited) experience.

However, from an aesthetic viewpoint I would argue that professionalism requires at least evaluating the alternatives. If it is too much



Agreed.

trouble for the benefit, then fine. In the case of Strategy (when one doesn't /need/ multiple instances), though, I would bet that in most



Refer to the limitation of each strategy keeping its internal state through member variables.


IME the vast majority of situations where one applies the Strategy pattern will naturally have all state variables in the Context object.

The Context can deal only with data common to each strategy. Not with the specifics of each concrete implementation.

cases the total number of executable statements will be the same or fewer if one does only one instance. The performance gain comes from /where/ the statements are, not changing them. So there is no justification for not doing it right to improve performance (however small the gain may be).



It imposes some limitations on the clients (concrete strategies), which cannot be enforced by the client logic (strategy client), and it makes the provider of the strategy implementation somewhat more complicated, so to me the question really amounts to: is it justified?


How? The way the GoF implement the pattern delegation the Client doesn't need to know that the Strategy objects even exist. That is a private matter between Context and Strategy. The DbC contract between the Client and the Context is exactly the same as it would have been without delegation.

As said - whatever contract there is it is between the client and strategy interface. The specific implementations might have additional needs. Such as using member variables :-)

<aside>
We have a generation of developers today that has lived with Moore's Law so long that they believe all performance problems are solved by getting a bigger and better computer. As a result they don't even know how to



I won't say I'm one of those, but I do get slightly irritated (sorry for using harsh words) when people use statements like "We might have performance problems there" (which then leads to devising complex schemes for avoiding virtual performance problems). IME these statements always came down to "we never had performance issues there", not because of the premature optimizations, but simply because the performance issue was never there. Whatever happened to that "measure twice, cut once" thing?

think about performance issues. Even Moore doesn't expect that Law to hold forever and when it hits an inflection point there is going to be a lot of gnashing of teeth as developers re-learn the hard-won lessons of the past.

In fact, most performance optimizations at the 3GL level are mostly a matter of proper style and mindset. IOW, they are almost idiomatic. Usually it doesn't take significantly more time and effort to optimize.



IMHO almost every optimization will complicate things if only a little bit. The time required in initial development may not be significant, but keep in mind that developers revisit that code and have to deal with greater complexity (however small).


If the tactical optimizations don't require any more code, as in the Strategy case, then why would there be problems subsequently? I would

How does it not require more code? On the one hand you have:
return new ConcreteStrategy(...)
on the other you need to have something more. At least an if statement and a corresponding member variable.
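Side by side, the two alternatives being compared might look like this in Python. ContextAlwaysNew and ContextCached are hypothetical names; the comments mark the extra pieces the cached variant needs.

```python
class ConcreteStrategy:
    pass


class ContextAlwaysNew:
    def strategy(self):
        return ConcreteStrategy()


class ContextCached:
    def __init__(self):
        self._strategy = None  # the extra member variable

    def strategy(self):
        if self._strategy is None:  # the extra if statement
            self._strategy = ConcreteStrategy()
        return self._strategy
```

The difference is a handful of lines, which is small but not zero.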

argue that by dealing with the instantiation properly one would actually make it easier for maintenance in the future because the instantiation will be better encapsulated.

IMHO, more performant approach is a priori better if LOC number is the same and it doesn't add any additional complexity such as unwanted side effects.

Note that one reason we use Singleton and the rest of the creation patterns is because we want to isolate and decouple instantiation issues from collaboration issues. The rules and policies for participation in collaborations are quite often different than the rules and policies governing the collaboration itself.

IMHO the explosion of Singleton::GetInstance() around the code is actually coupling. Every single module (method, class etc.) containing this line of code is dependent on (coupled to) the singleton pattern.

IMHO, given modern computer and prices, one should define the minimal configuration and acceptable performance expectations, monitor performance constantly (ideally via automated tests) and correct the issues where performance is not satisfactory.


That approach does not work well in practice IME. One problem is that performance problems often don't show up except under heavy load situations. Those are difficult to set up and tend to require a lot of time to execute, which precludes continuous performance monitoring. For example, proper load testing sometimes requires throwing randomized inputs at the software for tens or even hundreds.

While the real life cannot be always simulated, I still find the simulation (via tests) the best approach. What are the alternatives? Optimizing (which usually means complicating) different parts of code without any proof, in pure hope and faith that they will have observable effect on overall performance?

A more insidious problem is that failing to tactically write high performance code routinely is a cumulative problem that spans the entire code base. By the time one realizes that a performance problem is imminent, one is faced with a massive refactoring of all the code.

Performance problems usually fall in discrete parts of the code (bottlenecks). With proper design, division of responsibilities, encapsulation, decoupling, etc., one should be able to optimize by using local interventions (i.e. not all around the code).

In contrast, if one uses singletons throughout the code and wants to switch to two instances, one might face refactoring all over the code base (or some fix via a global bool and a localized if-else).
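To make the contrast concrete, here is a hedged Python sketch. Settings, old_report and report are hypothetical; the first function reaches the instance through the class, the second receives it as a parameter.

```python
class Settings:
    _instance = None

    def __init__(self, name="default"):
        self.name = name

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


def old_report():
    # Hard-wired: every function like this must be edited to use a second instance.
    return f"report using {Settings.get_instance().name}"


def report(settings):
    # Injected: the caller decides which of the instances applies.
    return f"report using {settings.name}"


legacy_settings = Settings.get_instance()
new_settings = Settings("new")
```

Code written like report() can be pointed at either instance without being changed; code written like old_report() cannot.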

The sorts of strategic design decisions that can create performance problems usually require substantial structural rewrites after the fact; essentially one has to completely rewrite the relevant code. So one is usually much better off doing some up-front analysis and then relying on

What up-front analysis is better than real measures?

prototyping to resolve any worrisome situations. IOW, one measures twice so that one only needs to cut once. B-)

I can agree to prototyping to measure performance if it means writing quick and dirty code to measure the performance of some approach.

However, if it amounts to theoretical discussions without real measurements, it has no sound basis.

Consider your own statement:
"... instantiating a [Strategy] object every time it is needed would be inefficient because of the overhead of heap operations."

What is the outer context? What is the percentage of time heap operation takes in the outer context? Can single instantiation really lead to observable performance effects?
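One way to put numbers on these questions is a quick measurement with Python's standard timeit module. The trivial Strategy class and the call count are assumptions; the result depends entirely on the language, runtime and surrounding workload, so the sketch prints what it measures and claims nothing about the outcome.

```python
import timeit


class Strategy:
    def compute(self, x):
        return x * 2


SHARED = Strategy()


def with_new_instance(x):
    return Strategy().compute(x)


def with_shared_instance(x):
    return SHARED.compute(x)


def measure(func, number=100_000):
    """Return the seconds taken by `number` calls of func(21)."""
    return timeit.timeit(lambda: func(21), number=number)


t_new = measure(with_new_instance)
t_shared = measure(with_shared_instance)
print(f"new each time: {t_new:.4f}s, shared: {t_shared:.4f}s")
```

To judge the effect in the outer context, the same comparison would have to be repeated with a realistic compute body rather than a single multiplication.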

The question still remains - when does one need Singleton?




I think one needs it when:

(A) there must be only one instance AND

(B) there is no feasible way in the solution to reduce the number of opportunities to instantiate an instance to 1. Specifically, there is no reasonable way to avoid:

(B1) Instantiating the object in different parts of the solution and/or

(B2) Instantiating the object within an iteration of some kind.
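As a hedged illustration of case (B2), consider a loop where the need for the object only shows up inside the iteration, possibly many times or not at all. ErrorFileProvider and process are hypothetical names; ErrorFile is borrowed from the earlier example.

```python
class ErrorFile:
    created = 0  # counts how many ErrorFile objects were instantiated

    def __init__(self):
        ErrorFile.created += 1
        self.lines = []


class ErrorFileProvider:
    def __init__(self):
        self._file = None

    def get_instance(self):
        if self._file is None:
            self._file = ErrorFile()
        return self._file


def process(records, provider):
    for record in records:
        if record < 0:
            # Case (B2): the need arises inside the loop, possibly many times.
            provider.get_instance().lines.append(f"bad record {record}")
```

No ErrorFile is created for clean input, and exactly one is created no matter how many bad records appear.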



See above.


I'm not sure what you are referring to.

That's because you snipped my statement:

"Fair enough, but I'm still seeking specific examples. Performance might be one - but I would probably go and resolve it without singleton. What is(are) the specific case(s) where singleton is justified?"

Sasa