Re: Singletons



Responding to Sasa...

And how would one in such example pass that instance to other interested participants (subfunctions, objects)?

The most common way to implement relationships is via a reference attribute. Such attributes are usually initialized by the constructor or a factory object when the object in hand is created. So one would pass the reference to the constructor/factory object.


You mean - via parameters?

Normally the only parameter would be to a constructor or factory object call. The relationship itself would be instantiated by the constructor or factory object assigning that reference to a referential attribute in the object being instantiated. Whenever that object subsequently needed to collaborate, it would address its messages to the referential attribute.
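
To make that concrete, here is a minimal C++ sketch of a relationship held in a referential attribute and set once by the constructor. The names Logger and OrderProcessor are hypothetical and chosen only for illustration; any collaborator would do.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical collaborator; the name and interface are illustrative only.
class Logger {
public:
    void log(const std::string& msg) { lines_.push_back(msg); }
    std::size_t count() const { return lines_.size(); }
private:
    std::vector<std::string> lines_;
};

// Hypothetical client that holds the relationship as a referential attribute.
class OrderProcessor {
public:
    // The constructor is the only place the reference is passed as a parameter.
    explicit OrderProcessor(Logger& logger) : logger_(logger) {}

    void process(int orderId) {
        // Collaboration: the message is addressed to the referential attribute.
        logger_.log("processed order " + std::to_string(orderId));
    }
private:
    Logger& logger_;  // the instantiated relationship to the collaborator
};
```

Two OrderProcessor objects constructed with the same Logger would both address their messages to that one instance, with no global access involved.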

Only in the lazy load variant. Singleton could be unconditionally instantiated as a global, and simply retrieved via GetInstance. Always.

But that's not the way Singleton works. There are actually two objects involved in Singleton. GetInstance is a method of the Singleton object, which is not the object that the client actually collaborates with through the reference. To access that method there must already be a relationship that can be navigated by the client to get to the Singleton object to invoke GetInstance. (Some poorly formed OOPLs like C++ allow one to create global references directly for cases like this but most don't.)

The Singleton.GetInstance method then returns a reference to the object that the client actually collaborates with. The Singleton instantiates that collaboration object as needed. The reference that the Singleton returns instantiates a temporary relationship to the collaboration object. So what one has is:

          0..*
[Client] ----------------------------+
    | *                              |
    |                                |
    | R1                             |
    |                                |
    | gets reference from            |
    | 1                              |
[Singleton]                          | R3
    | 1                              |
    |                                |
    | R2                             |
    |                                |
    | creates                        |
    | 0..1          0..1 accesses    |
[Collaborator] ----------------------+

When Singleton passes back the reference to Client it is temporarily instantiating the R3 relationship between Client and the Collaborator that Client navigates when it talks to Collaborator (i.e., allows Client to address messages to the right Collaborator).


In GoF there is only one class (Singleton) and it has only one instance (object). If you are referring to one class keeping only one instance of the other class - then it is not (GoF) Singleton.

They are using an implementation mechanism -- a static class method -- that may not be available in all OOPLs. If one were to model it in a language-independent manner in the OOA/D, that model would look like mine.

Bottom line: when Client accesses Singleton.GetInstance, it does that via exactly the same sort of relationship navigation that it would employ to access any factory object. IOW, there must be a relationship path to Singleton just as there must be a relationship path to a factory object.
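
A rough C++ sketch of the two-object model, under the assumption that the client reaches the Singleton through an ordinary reference (R1) rather than a static method. The class names CollaboratorSource and Client, and the handle method, are hypothetical.

```cpp
#include <memory>

// Hypothetical collaborator of which at most one instance should exist.
class Collaborator {
public:
    int handle(int x) { return x + 1; }
};

// Plays the [Singleton] role from the diagram: a factory-like object that
// creates the Collaborator on demand (R2) and hands out references to it.
class CollaboratorSource {
public:
    Collaborator& getInstance() {
        if (!instance_) {
            instance_.reset(new Collaborator());
        }
        return *instance_;
    }
private:
    std::unique_ptr<Collaborator> instance_;  // 0..1 Collaborator
};

// The [Client] navigates R1 to reach the source, then R3 to collaborate.
class Client {
public:
    explicit Client(CollaboratorSource& source) : source_(source) {}
    int run(int x) { return source_.getInstance().handle(x); }
private:
    CollaboratorSource& source_;  // R1, set at construction like any factory reference
};
```

Nothing in Client would change if CollaboratorSource were replaced by a factory that returned a new Collaborator each time.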


It does that via static method. That's not actually relationship navigation. It is invoking global method. This is what bothers me with singleton.
Is it possible that you consider singleton to be something else?

Not exactly. I agree with you that static class methods are overused during OOP. My problem is with GoF for choosing to explicitly tie their pattern to a specific OOPL implementation mechanism that may not be portable across languages. They could have expressed Singleton with an OOA/D model like mine above that would be language-independent. Then one could implement it in a specific OOPL tactically using a static method and a single class. That would separate the OOA/D conceptual issues from the OOP tactical issues.
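
For comparison, the tactical single-class form with a static method might look like the sketch below in C++. The class name Registry and its nextId method are hypothetical; only the access mechanism is the point.

```cpp
// Hypothetical class; only the static access mechanism matters here.
class Registry {
public:
    // Static class method: reachable without navigating any relationship.
    static Registry& Instance() {
        static Registry instance;  // constructed once, on the first call
        return instance;
    }

    // Copying is disabled so a second instance cannot be made by accident.
    Registry(const Registry&) = delete;
    Registry& operator=(const Registry&) = delete;

    int nextId() { return ++lastId_; }

private:
    Registry() = default;  // clients cannot construct one directly
    int lastId_ = 0;
};
```

Here the Singleton and the collaboration object have been folded into one class, which is exactly the OOP-level shortcut being discussed.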

[The GoF did a great job in essentially opening a whole new basis of communication and practice in OO development. But they didn't get all the details exactly right out of the box. This mixing of OOA/D and OOP in the Singleton pattern definition is one example. The way they /implement/ delegation by representing the delegated object's properties in the interface of the Context object is another problem. But these are usually rather benign problems that usually don't show up except in discussions like these.]

Note that most factory objects are only instantiated once, typically during subsystem initialization. When one instantiates instances of the clients that invoke the factory objects, one also instantiates a relationship between them and the factory object (usually a pointer reference). One does exactly the same thing with a Singleton.

Also note that the GoF themselves tacitly view Singleton as a factory object because they group it with the other factory patterns in the Creation Patterns category.


In GoF in the responsibility section, following is stated for the singleton:
"may be responsible for creating its own unique instance."
Notice the word may (not is).

Yes. This seems curious given their description under Participants. This sounds more like my OOA/D view where they are conceptually different objects.


Consider also the end of the 2nd point in implementation paragraph:
"No longer is the Singleton class responsible for creating the singleton. Instead, its primary responsibility is to make the singleton object of choice accessible in the system. "

Yes, this is probably why they chose to tie Singleton to static methods. I agree it strongly suggests they were thinking primarily in terms of providing a global reference rather than enforcing a rule on instantiation. If so, I don't like that at all because globals are antithetical to the OO paradigm.

Consider following: you have bunch of code relying on the singleton. Now due to new requirements you realize that newly written code must use different (but only one) instance. It basically means you have two sorts of code. How will you replace singleton with factory? How will the factory know which instance to return?

I don't think any code changes outside of the implementation of Singleton itself. The client invokes Singleton.GetInstance and gets a reference back that it collaborates with. If you change the implementation of that Singleton object so it becomes a simple factory object that creates a new instance each time, it isn't a Singleton pattern anymore. Instead it's something like Concrete Creator in the Factory Method pattern. However, that is completely transparent to the client because the Client still invokes the same GetInstance method, it still gets the right instance of the collaboration object, and it still collaborates with that object in the same ways. IOW, the Client is completely unaffected by the change.


This is trivial case where you modify the GetInstance to always create new instance. My question is what if you need to move from one instance to two instance? What if you have bunch of code invoking GetInstance() and now need to separate that code to use two (and no more instances)? How will you refactor that?

That code will still all be in the Singleton implementation, hidden from the client. In fact, it would all be encapsulated in the implementation of the Singleton::Instance method.

[In practice it might not be, depending on how the context was defined. With two instances there must be some way to determine which one the client should get (assuming there are more than two clients). If that context rule is defined somewhere other than the client (e.g., an attribute in some other object), Singleton can go get it and make the right choice transparently to the client.

But if the client logically knows which one should be chosen, it will have to tell Singleton that with a parameter and that will change the interface. But then I would argue one is solving a different problem.]
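
A sketch of the first case, where the rule for choosing between two instances lives outside the client. RoutingPolicy, its useSecondary flag, and the id values are assumptions made up for the example.

```cpp
#include <memory>

// Hypothetical collaborator; the id only lets us tell the two instances apart.
class Collaborator {
public:
    explicit Collaborator(int id) : id_(id) {}
    int id() const { return id_; }
private:
    int id_;
};

// Hypothetical holder of the context rule; it is not part of the client.
struct RoutingPolicy {
    bool useSecondary = false;
};

class CollaboratorSource {
public:
    explicit CollaboratorSource(const RoutingPolicy& policy) : policy_(policy) {}

    // Same signature as the one-instance version; the choice is hidden here.
    Collaborator& getInstance() {
        std::unique_ptr<Collaborator>& slot =
            policy_.useSecondary ? secondary_ : primary_;
        if (!slot) {
            slot.reset(new Collaborator(policy_.useSecondary ? 2 : 1));
        }
        return *slot;
    }

private:
    const RoutingPolicy& policy_;            // where the context rule is read from
    std::unique_ptr<Collaborator> primary_;  // at most two instances ever exist
    std::unique_ptr<Collaborator> secondary_;
};
```

Code that calls getInstance() is untouched by the move from one instance to two; only the implementation behind it changes.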

That is exactly my point. One does not want to open/close channels around packets. One wants to do it at the level of messages. But breaking up a message into packets may be done in a different place (maybe even a different subsystem) that doesn't understand channels. So one has to deal with channels in the context where they are relevant, which is dispatching packets to the network.

So why not form MessageContext class or something which encapsulates Packet?

Or - why not do following:
a) packets[] = BreakMessage()
b) channel = CreateChannel()
c) for each packet in packets
       SendPacket(packet, channel)
?

My point is that the context for doing (a) may be in one part of the application while the context for doing (b) and (c) may be in another part of the application, possibly a different subsystem where CreateChannel and SendPacket process one packet at a time.


I can agree with that if we are talking either about legacy system which is hard to refactor or 3rd party systems to which code we have no access.

I think it is a potential problem in any system that is properly partitioned. One deliberately uses subsystems to isolate and encapsulate different concerns and different levels of abstraction. If breaking up the message into packets depends on knowledge of the message content, then I would normally expect that to be done in a different subsystem because manipulating content is a quite different subject matter at a higher level of abstraction than manipulating packets when writing to network hardware.


That sort of separation of concerns is not uncommon. BreakMessage may need to understand the content of a Message to break it up properly. That is at a higher level of abstraction than the rules and policies of shipping packets off to the network. In the latter context one does not need to know anything about the content of the packet but one needs to know a lot about channels. To make the application robust one needs to decouple those concerns as much as possible.


The sending logic needs to know about the content and about the channels (since it sends bits via channel(s)). It might not need to know about the packets, if we treat them as higher level abstractions, still those packets get to be transferred to raw bits.

I have to disagree. If the sending logic is to be reusable (which is highly desirable for things like a network interface) it needs to be independent of content. That is, it should view the packet simply as a set of contiguous bytes it must write to the hardware. At that level it should really know nothing about messages so the subsystem interface would be at the level of sendPacket(packet ID, byte count, buffer address).

In our example, to optimize channel resources it also needs to be told when it can free channels, so the sendPacket method might have a bool argument to indicate whether the packet in hand is the last of a series. But it doesn't really need to understand the semantics of "last packet"; it just needs to do the right thing with the boolean value. Thus the subsystem breaking up messages thinks about "last Packet" but the network interface thinks about "free the channel".


Question: how do you think responsibilities should be divided? What design leads to justification of singleton?

In the situation immediately above, the network interface needs to obtain a channel for each packet when one isn't available. But it should only free an available channel when told it is OK to do so. One way to do that is by (C++):

void sendPacket (int ID, int bcnt, ByteArray* buffer, bool free)
{
    // obtain the current channel; the singleton is defined elsewhere
    Channel* channel = Channel::Instance();

    // write the packet buffer to the channel with appropriate
    // formatting, handshaking, etc.

    if (free)
        Channel::Free(); // deletes the instance
}

[Obviously this doesn't work if multiple clients are breaking up messages and invoking the interface concurrently. But one can deal with that by providing additional identity information. I'm just keeping the example simple.]
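
For completeness, one way the Channel singleton "defined elsewhere" could look, as a single-threaded sketch. Everything beyond Instance() and Free() is an assumption for illustration: the ByteArray alias, the write method, and the openCount counter are not part of the example above.

```cpp
#include <cstddef>
#include <vector>

using ByteArray = std::vector<unsigned char>;  // assumption: stand-in buffer type

class Channel {
public:
    // Return the current channel, opening one only if none is available.
    static Channel* Instance() {
        if (instance_ == nullptr) {
            instance_ = new Channel();
            ++openCount_;
        }
        return instance_;
    }

    // Release the channel; the next Instance() call opens a fresh one.
    static void Free() {
        delete instance_;
        instance_ = nullptr;
    }

    static int openCount() { return openCount_; }  // for illustration only

    void write(const ByteArray& buffer) { bytesWritten_ += buffer.size(); }
    std::size_t bytesWritten() const { return bytesWritten_; }

private:
    Channel() = default;
    std::size_t bytesWritten_ = 0;
    static Channel* instance_;
    static int openCount_;
};

Channel* Channel::instance_ = nullptr;
int Channel::openCount_ = 0;

// Simplified variant of sendPacket: the byte count comes from the buffer.
void sendPacket(int id, const ByteArray& buffer, bool freeChannel) {
    (void)id;  // a real interface would use the ID for framing
    Channel* channel = Channel::Instance();
    channel->write(buffer);
    if (freeChannel) {
        Channel::Free();  // the caller said this was the last packet
    }
}
```

The network interface never learns what "last packet" means; it only sees a flag that tells it the channel may be released.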


There is an analogy to database transactions. In the UI of the application that uses the DB, there is no need to know anything about DB transactions. However, the developer always finds some mapping that is relevant (e.g., the user hitting Save in a particular window triggers the close of an open DB transaction). The UI subsystem simply announces that the Save button has been hit. In the subsystem that understands RDBs that announcement is mapped into a closeTransaction action of some sort.

I cannot relate to that analogy. Sorry. IMHO Save button can only relate to Open/Close pair. Not to close.

I'm afraid I don't understand your point. DB transactions span multiple DB update queries. The user is supplying data to construct those queries through a sequence of discrete activities in the UI that have a beginning and end. In the UI there will have to be some unique user action that signals the beginning of such a sequence that maps to opening a DB transaction. That is, there /must/ be separate actions in the UI that define the beginning and end of the sequence that correspond to opening and closing DB transactions.


AFAIK, it is a bad idea that one UI element triggers opening of the transaction, while the other triggers closing (if we talk about DB transaction). The DB statements inside the transactions lock rows. What if user clicks New (which opens the transaction) then issues some queries (which lock the data) and then goes on a coffee break before she clicks Save? Touched rows are locked to everyone else.

I find it better to have single UI element both opening and closing transaction.

OK, I see where you are now. I do not see this as a UI problem. If that sort of lockup is a concern, then one needs two distinct processing sequences in the application. The data the user provides via the UI must be batched in memory until the user is done entering it. When that is complete, then the application needs to construct and issue the various queries from the batched data. The transaction will be opened and closed around the second sequence of activities, not the user's UI entries.

Whatever in the UI indicated start/end for data entry now maps just to the start and end of batching the data in memory. However, the UI end now also maps to the start of the activities for generating queries. The start/end process around generating queries from the batched data now maps to opening and closing a DB transaction. But one still has unique start/end conditions that map to DB transaction open/close.

Typically, this sort of optimization to alleviate DB resource allocation problems will be done wholly in a subsystem that accesses the RDB. IOW, one effectively provides a write cache. However, the main application logic could batch the data while the DB access subsystem generates the queries. But I would not expect the UI to have anything to do with the DB transactions in this scenario.
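
A minimal sketch of such a write cache in C++. FakeDatabase is a hypothetical stand-in that only records calls, and the stage/flush names are assumptions; a real DB access subsystem would also need error handling and rollback.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a DB connection; it just records what was called.
class FakeDatabase {
public:
    void beginTransaction() { log.push_back("BEGIN"); }
    void execute(const std::string& query) { log.push_back(query); }
    void commit() { log.push_back("COMMIT"); }
    std::vector<std::string> log;
};

class WriteCache {
public:
    explicit WriteCache(FakeDatabase& db) : db_(db) {}

    // Called while the user is still entering data; no transaction is open.
    void stage(const std::string& query) { pending_.push_back(query); }

    // Called when the UI announces the end of data entry (e.g., Save).
    // The transaction spans only the replay of the batched queries.
    void flush() {
        if (pending_.empty()) return;
        db_.beginTransaction();
        for (const std::string& query : pending_) {
            db_.execute(query);
        }
        db_.commit();
        pending_.clear();
    }

private:
    FakeDatabase& db_;
    std::vector<std::string> pending_;  // batched in memory until flush()
};
```

A coffee break between stage() calls locks nothing, because the database is not touched until flush().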


Yes. Strategy should be stateless in most situations. It is a delegation of behaviors. It gets its data from the Context object.

What if Strategy subclass uses member variables to keep its internal state which is irrelevant to the outer world (some accumulating data etc)?

Then you would need to have multiple instantiations. My point is just that that is rather unusual. The much more common situation is something like an Employee object where one needs different strategies for computing the benefits in a payroll system. Each strategy will eat exactly the same data, which is all contained in Employee.

Consider following:
You make the framework which relies on stateless strategies. You publish it. How will you enforce the users to avoid member variables (other than through documentation)?

I'm afraid I don't understand where you are going with this. One utilizes the Strategy pattern to resolve very specific dynamic relationship problems. So the implementation of a specific application of the Strategy pattern will be highly tailored to a very specific delegation context. So I don't follow where frameworks and other large scale infrastructures enter the picture.


My point is: you create some library. You publish it in binary. You have some logic using the strategy pattern and enable the clients to register their own concrete strategy.
What if your code relies on single instance of each strategy?
What if some client writes concrete strategy which relies on internal member variables?

I still don't understand this. What is going in the library?

A library implies reuse in multiple contexts but one usually implements a Strategy pattern to solve a particular problem. That is, while the pattern is generic, the implementation of the pattern is unique to the problem in hand.

My point generalized - what each strategy will do is really not known. The only thing one can know about the strategy is that it will implement the interface.

The interface for a particular Strategy pattern implementation is defined in the <polymorphic> interface of the Strategy object.


But the implementation is not - hence you cannot predict what the developer of the concrete strategy will do. If the moon is full she might even use member variables :-D

True. But the developer is writing the implementation of the pattern to the specific problem in hand. That includes the interface, the DbC contracts, and every Strategy subclass. It is all tailored to the specific Client/Context situation.

I am getting the impression from this sub-discussion that you see design patterns as something that can be implemented in a reusable fashion and put in a class library. If so, I don't think that is the case. It is in the nature of patterns that when they are actually applied they must be tailored specifically to the context. IOW, their implementations are not reusable.


In many situations instantiating a [Strategy] object every time it is needed would be inefficient because of the overhead of heap operations.

<snip>

trouble for the benefit, then fine. In the case of Strategy (when one doesn't /need/ multiple instances), though, I would bet that in most

Refer to the limitation of each strategy keeping its internal state through member variables.

IME the vast majority of situations where one applies the Strategy pattern will naturally have all state variables in the Context object.


The Context can deal only with data common to each strategy. Not with the specifics of each concrete implementation.

Are you talking about local (stack) variables or state variables (attributes persistent between Strategy method invocations)?

I am talking about state variables. IME, the Strategies just provide different behaviors that eat the same state variables and the state variables are usually owned by the Context object. For example,

          *  computes for  R1  1
[Asset] ----------------------- [DepreciationStrategy]
+ baseValue                               A
+ totalPeriods                            |
+ currentPeriod             +-------------+------....
                            |             |
                        [Linear]  [DoubleDeclining]

For each [DepreciationStrategy] subclass a different formula is used to compute depreciation. But each formula computes using the same basic data that is intrinsic to [Asset].
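
In C++ this might be sketched as follows. The formulas are the textbook straight-line and double-declining-balance ones with salvage value ignored, which is a simplifying assumption; the compute and depreciationFor names are also made up for the example.

```cpp
#include <cmath>

class DepreciationStrategy;

// All state lives in the Asset; strategies hold none of their own.
struct Asset {
    double baseValue;
    int totalPeriods;
    int currentPeriod;                     // 1-based
    const DepreciationStrategy* strategy;  // R1
};

class DepreciationStrategy {
public:
    virtual ~DepreciationStrategy() = default;
    // Depreciation charged for the asset's current period.
    virtual double compute(const Asset& asset) const = 0;
};

class Linear : public DepreciationStrategy {
public:
    double compute(const Asset& a) const override {
        return a.baseValue / a.totalPeriods;
    }
};

class DoubleDeclining : public DepreciationStrategy {
public:
    double compute(const Asset& a) const override {
        const double rate = 2.0 / a.totalPeriods;
        // Book value at the start of the current period.
        const double bookValue =
            a.baseValue * std::pow(1.0 - rate, a.currentPeriod - 1);
        return bookValue * rate;
    }
};

// Delegation: the asset's data is handed to whichever strategy it references.
inline double depreciationFor(const Asset& a) {
    return a.strategy->compute(a);
}
```

Because neither subclass has member variables, a single Linear and a single DoubleDeclining object can serve any number of Asset instances.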

[This sort of parametric polymorphism is a very useful -- but sadly underplayed -- tool for OO development. My blog has a category on Invariants and parametric polymorphism with a number of examples. One of which happens to take Depreciation a step further and eliminates the need for Strategy at all.]


cases the total number of executable statements will be the same or fewer if one does only one instance. The performance gain comes from /where/ the statements are, not changing them. So there is no justification for not doing it right to improve performance (however small the gain may be).

It imposes some limitations on the clients (concrete strategies), which cannot be enforced by the client logic (strategy client), and it makes the provider of strategy implementation somewhat more complicated so to me the question really amounts to- is it justified?

How? The way the GoF implement the pattern delegation the Client doesn't need to know that the Strategy objects even exist. That is a private matter between Context and Strategy. The DbC contract between the Client and the Context is exactly the same as it would have been without delegation.


As said - whatever contract there is it is between the client and strategy interface. The specific implementations might have additional needs. Such as using member variables :-)

No matter what data they access and where it lives, the DbC contract between Client and Context remains in force. Whatever the differences in Strategy flavors, they must all obey the DbC contract between Client and Context and that contract is implicit in the DbC contract between Context and the Strategy superclass interface. IOW, LSP must be satisfied.

But this is getting far afield from the performance issues around single vs. multiple instances _when they don't need member state variables_. (Recall that I said that when they /do/ need member state variables, one does need multiple instances.)


<aside>
We have a generation of developers today that has lived with Moore's Law so long that they believe all performance problems are solved by getting a bigger and better computer. As a result they don't even know how to

I won't say I'm one of those, but I do get slightly irritated (sorry for using harsh words) when people are using statements like "We might have performance problems there" (which then leads to devising complex schemes of avoiding virtual performance problems). IME these statements always came down to "we never had performance issues there" - not because of the premature optimizations, but simply because performance issue was never there. Whatever happened to that "cut once measure twice" thing?

think about performance issues. Even Moore doesn't expect that Law to hold forever and when it hits an inflection point there is going to be a lot of gnashing of teeth as developers re-learn the hard-won lessons of the past.

In fact, most performance optimizations at the 3GL level are mostly a matter of proper style and mindset. IOW, they are almost idiomatic. Usually it doesn't take significantly more time and effort to optimize.

IMHO almost every optimization will complicate things if only a little bit. The time required in initial development may not be significant, but keep in mind that developers revisit that code and have to deal with greater complexity (however small).



If the tactical optimizations don't require any more code, as in the Strategy case, then why would there be problems subsequently? I would


How does it not require more code? On the one hand you have:
return new ConcreteStrategy(...)
on the other you need to have something more. At least an if statement and a corresponding member variable.

I don't think there is any "something more". I see no need for an IF statement. One always invokes new ConcreteStrategy. The issue is whether you do it once in one place or many times, possibly in many places.

          *         R1         1
[Context] -------------------- [Strategy]
                                   A
                                   |
                                  ...

The R1 relationship must be instantiated in all cases where a [Context] object is created. When you instantiate a [Context] object you can create a [Strategy] every time and assign its reference to the referential attribute in [Context] or you can simply assign the reference to a [Strategy] that you created once to the referential attribute. The only difference is where you get the [Strategy] reference (e.g., whether new is invoked during subsystem initialization or in the factory method for instantiating a Context).
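
A sketch of the create-once variant, assuming a stateless strategy. Doubler, ContextFactory, and the apply/run methods are hypothetical names.

```cpp
// Hypothetical stateless strategy hierarchy.
class Strategy {
public:
    virtual ~Strategy() = default;
    virtual int apply(int x) const = 0;
};

class Doubler : public Strategy {
public:
    int apply(int x) const override { return 2 * x; }
};

class Context {
public:
    // R1 is instantiated here in every case; only the source of the reference varies.
    explicit Context(const Strategy& strategy) : strategy_(strategy) {}
    int run(int x) const { return strategy_.apply(x); }
    const Strategy& strategy() const { return strategy_; }
private:
    const Strategy& strategy_;  // referential attribute
};

class ContextFactory {
public:
    // No IF statement and no extra bookkeeping: just assign the existing reference.
    Context create() const { return Context(shared_); }
private:
    Doubler shared_;  // created once, when the factory itself is created
};
```

If Doubler later acquired state variables, the change would be confined to ContextFactory: it would construct a strategy per Context instead of sharing one.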


argue that by dealing with the instantiation properly one would actually make it easier for maintenance in the future because the instantiation will be better encapsulated.


IMHO, more performant approach is a priori better if LOC number is the same and it doesn't add any additional complexity such as unwanted side effects.

Then we seem to be agreed on tactical OOP performance programming.

IMHO, given modern computer and prices, one should define the minimal configuration and acceptable performance expectations, monitor performance constantly (ideally via automated tests) and correct the issues where performance is not satisfactory.

That approach does not work well in practice IME. One problem is that performance problems often don't show up except under heavy load situations. Those are difficult to setup and tend to require a lot of time to execute, which precludes continuous performance monitoring. For example, proper load testing sometimes requires throwing randomized inputs at the software for tens or even hundreds.


While the real life cannot be always simulated, I still find the simulation (via tests) the best approach. What are the alternatives? Optimizing (which usually means complicating) different parts of code without any proof, in pure hope and faith that they will have observable effect on overall performance?

My issue here was simply with the notion of monitoring performance constantly (e.g., with every build). That is often impractical.


A more insidious problem is that failing to write high performance code routinely at the tactical level is a cumulative problem that spans the entire code base. By the time one realizes that a performance problem is imminent, one is faced with a massive refactoring of all the code.


Performance usually falls in discrete parts of code (bottlenecks). With proper design, responsibilities divisions, encapsulation, decoupling, etc. one should be able to optimize by using local interventions (i.e. not all around the code).

As you point out, bottlenecks are usually strategic design issues. Here I am talking about routine OOP code writing. By the time it becomes clear that there is a cumulative problem, it is usually too late to avoid a massive refactoring.

I believe this sort of problem is pervasive. I think it is a substantial contributor to the bigger-computer-needed problem. It is no accident that most software houses define the minimum configurations needed /after/ the software is done rather than as up front requirements.

The sorts of strategic design decisions that can create performance problems usually require substantial structural rewrites after the fact; essentially one has to completely rewrite the relevant code. So one is usually much better off doing some up-front analysis and then relying on


What up-front analysis is better than real measures?

The analysis identifies potential problem areas. One then measures the problem areas via a prototype before committing to a particular architecture throughout the development.


prototyping to resolve any worrisome situations. IOW, one measures twice so that one only needs to cut once. B-)


I can agree to prototyping to measure performance if it means write the quick and dirty code to measure performance of some approach.

However if it amounts to theoretical discussions without real measures - it is without any reasoning.

As I said, performance analysis and prototyping are a matched set.


Consider your own statement:
"... instantiating a [Strategy] object every time it is needed would be inefficient because of the overhead of heap operations."

What is the outer context? What is the percentage of time heap operation takes in the outer context? Can single instantiation really lead to observable performance effects?

This is a different context. The quote here is about routine tactical coding practices at the 3GL level. Those practices very rarely require significant effort beyond ignoring performance because they are almost idiomatic once one has the correct mindset. Implementing Strategy so the subclasses are instantiated once requires no extra effort or complexity. So all one needs to know is whether the members have state variables or not to select the right approach. And it is easy to change one's mind later (just move the new) if subsequently state variables are required. IOW, it is not that tough to write high performance code routinely.


The question still remains - when does one need Singleton?





I think one needs it when:

(A) there must be only one instance AND

(B) there is no feasible way in the solution to reduce the number of opportunities to instantiate an instance to 1. Specifically, there is no reasonable way to avoid:

(B1) Instantiating the object in different parts of the solution and/or

(B2) Instantiating the object within an iteration of some kind.




See above.



I'm not sure what you are referring to.


That's because you snipped my statement:

"Fair enough, but I'm still seeking specific examples. Performance might be one - but I would probably go and resolve it without singleton. What is(are) the specific case(s) where singleton is justified?"

I elided it because I had already answered that question immediately below your statement. Performance only comes into the picture in defining the (A) requirement. But once the (A) requirement is established, I think the criteria for using Singleton are what I described.


*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
hsl@xxxxxxxxxxxxxxxxx
Pathfinder Solutions
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
"Model-Based Translation: The Next Step in Agile Development". Email
info@xxxxxxxxxxxxxxxxx for your copy.
Pathfinder is hiring: http://www.pathfindermda.com/about_us/careers_pos3.php.
(888)OOA-PATH


