This is not another post about how to solve the double checked locking idiom. The aim here is to understand what could go wrong without synchronization.
One of the most important promises of the Java Memory Model (JMM) is:
If a program is correctly synchronized, then all executions of the program will appear to be sequentially consistent.
This is an extremely strong guarantee for programmers. Programmers do not need to reason about reorderings to determine that their code contains data races. Therefore they do not need to reason about reorderings when determining whether their code is correctly synchronized. Once the determination that the code is correctly synchronized is made, the programmer does not need to worry that reorderings will affect his or her code.
The following code, taken from Java Concurrency in Practice, is not thread safe because accesses to the shared resource variable are not properly synchronized:
public class UnsafeLazyInitialization {
    private static Resource resource;

    public static Resource getInstance() {
        if (resource == null)          //1
            resource = new Resource(); //2
        return resource;               //3
    }
}
I will detail below some of the problems that could happen when running that innocent piece of code in a multi-threaded environment. In particular:
- resource could be instantiated more than once
- getInstance could return an object in an inconsistent state
- more interestingly, getInstance could return null
Multiple instantiation
This is the most obvious issue: the code uses a check-then-act pattern, so two threads can enter the method at the same time, both see resource as null, and both initialise the variable. We now have two instances of what was supposed to be a singleton.
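The race is timing-dependent, so it rarely shows up on demand. The sketch below forces the bad interleaving rather than waiting for it: a CyclicBarrier (an instrumentation hook added here, not part of the original code) holds each thread between the check and the act until both have passed the null check. The class name, the counter and the barrier are all assumptions made for illustration.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

final class MultipleInstantiationDemo {
    // Counts constructor calls; hypothetical instrumentation for this demo only.
    static final AtomicInteger CREATED = new AtomicInteger();

    static final class Resource {
        Resource() { CREATED.incrementAndGet(); }
    }

    private static Resource resource;

    // Demo-only hook: releases the threads once both are past the null check.
    private static final CyclicBarrier BOTH_CHECKED = new CyclicBarrier(2);

    static Resource getInstance() {
        if (resource == null) {          // check
            awaitOtherThread();          // forced pause between check and act
            resource = new Resource();   // act
        }
        return resource;
    }

    private static void awaitOtherThread() {
        try {
            BOTH_CHECKED.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs getInstance() on two threads and returns how many objects were built.
    static int run() {
        Thread t1 = new Thread(MultipleInstantiationDemo::getInstance);
        Thread t2 = new Thread(MultipleInstantiationDemo::getInstance);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return CREATED.get();
    }

    public static void main(String[] args) {
        System.out.println("instances created: " + run());
    }
}
```

With the barrier in place, neither thread can write before the other has read, so both constructors run and the program reports two instances. Without the barrier the same outcome is still possible, just much harder to reproduce.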
Improperly constructed object
This one is less obvious – line 2 looks like it is atomic but it isn’t as the JVM needs to (among other things):
- allocate some memory
- create the new object
- initialise its fields with their default value (false for boolean, 0 for other primitives, null for objects)
- run the constructor, which includes running parent constructors too
- assign the reference to the newly constructed object to resource
Because there is no synchronization, the JMM allows a JVM to perform these steps in virtually any order. See for example this famous discussion about double checked locking, which shows that some JIT compilers do run step 5 (the assignment) before step 4 (the constructor).
So getInstance could return a reference to a non-null but inconsistent object (with uninitialised fields).
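Java source cannot express "publish the reference before running the constructor" directly, so the following single-threaded sketch only imitates that order by hand. The method runConstructorBody is a hypothetical stand-in for the constructor (step 4), and the field value and the number 42 are invented for illustration. It shows what a reader would find if it looked at the object in the window between steps 5 and 4.

```java
final class ReorderedPublicationSketch {
    static final class Resource {
        int value;                     // holds the default 0 until initialised

        void runConstructorBody() {    // hypothetical stand-in for step 4
            value = 42;
        }
    }

    static Resource resource;

    // Performs the steps in the order 1, 2, 3, 5, 4 and returns what a reader
    // would see if it inspected the object right after step 5.
    static int valueSeenAfterEarlyPublication() {
        Resource r = new Resource();   // steps 1-3: allocate, default values
        resource = r;                  // step 5: reference published early
        int seen = resource.value;     // a concurrent reader here would see 0
        r.runConstructorBody();        // step 4: initialisation arrives too late
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("seen before init: " + valueSeenAfterEarlyPublication());
        System.out.println("value after init: " + resource.value);
    }
}
```

The reference is non-null from step 5 onwards, so a null check in another thread would pass even though the fields still hold their default values.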
getInstance can return null
This is even less obvious. It is difficult to imagine an execution path that could return null with such simple code. However, the JMM allows it. To understand why this is possible, we need to analyse the reads and writes in detail and assess whether there is a happens-before relationship between them. The code can be rewritten as follows to clearly show the reads and writes:
Thread 0
---------------------------------------------------------------------
10: resource = null; //default value                          //write
=====================================================================
Thread 1                          | Thread 2
----------------------------------+----------------------------------
11: a = resource;                 | 21: x = resource;                //read
12: if (a == null)                | 22: if (x == null)
13:   resource = new Resource();  | 23:   resource = new Resource(); //write
14: b = resource;                 | 24: y = resource;                //read
15: return b;                     | 25: return y;
The JLS #17.4.5 gives the rules for a read to be allowed to observe a write:
We say that a read r of a variable v is allowed to observe a write w to v if, in the happens-before partial order of the execution trace:
- r is not ordered before w (i.e., it is not the case that hb(r, w)), and
- there is no intervening write w’ to v (i.e. no write w’ to v such that hb(w, w’) and hb(w’, r)).
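In this example, no happens-before relationship orders thread 1's write at line 13 relative to thread 2's reads, so both 21 and 24 are allowed to observe either 10 or 13. A legal execution of the program is therefore (assuming thread 1 sees resource as null and initialises it):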
- 21: x = not null (reads the write line 13)
- 22: false
- 24: y = null (reads the write line 10)
- 25: return null
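The two quoted rules are mechanical enough to check by program. The sketch below models only the scenario above (thread 1 performs the write at line 13, thread 2 skips line 23) and only the happens-before rule quoted from the JLS. It does not model the rest of the JMM, such as the causality requirements. The action names (w10, r11, w13, r14, r21, r24) simply prefix the listing's line numbers with w for write or r for read. They, and the class itself, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class AllowedToObserve {
    // Actions of the scenario, named after the line numbers in the listing.
    static final String[] NAMES = {"w10", "r11", "w13", "r14", "r21", "r24"};
    static final boolean[] IS_WRITE = {true, false, true, false, false, false};

    // Builds happens-before: the default write precedes every other action,
    // program order holds inside each thread, then take the transitive closure.
    static boolean[][] happensBefore() {
        int n = NAMES.length;
        boolean[][] hb = new boolean[n][n];
        for (int i = 1; i < n; i++) hb[0][i] = true;  // w10 -> everything
        hb[1][2] = true;                              // thread 1: r11 -> w13
        hb[2][3] = true;                              // thread 1: w13 -> r14
        hb[4][5] = true;                              // thread 2: r21 -> r24
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (hb[i][k] && hb[k][j]) hb[i][j] = true;
        return hb;
    }

    // Returns the writes that the given read is allowed to observe.
    static List<String> allowedWrites(String readName) {
        boolean[][] hb = happensBefore();
        int r = indexOf(readName);
        List<String> allowed = new ArrayList<>();
        for (int w = 0; w < NAMES.length; w++) {
            if (!IS_WRITE[w] || hb[r][w]) continue;   // rule 1: not hb(r, w)
            boolean hidden = false;
            for (int w2 = 0; w2 < NAMES.length; w2++) {
                // rule 2: no write w2 with hb(w, w2) and hb(w2, r)
                if (IS_WRITE[w2] && hb[w][w2] && hb[w2][r]) hidden = true;
            }
            if (!hidden) allowed.add(NAMES[w]);
        }
        return allowed;
    }

    static int indexOf(String name) {
        for (int i = 0; i < NAMES.length; i++) {
            if (NAMES[i].equals(name)) return i;
        }
        throw new IllegalArgumentException("unknown action: " + name);
    }

    public static void main(String[] args) {
        for (String read : new String[] {"r11", "r14", "r21", "r24"}) {
            System.out.println(read + " may observe " + allowedWrites(read));
        }
    }
}
```

Under these assumptions, thread 1's second read (r14) may only observe w13, because that write sits between the default write and the read in program order. Thread 2's reads have no such ordering with w13, so each of them may independently observe either w10 or w13, which is exactly what permits the "non-null, then null" execution.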
Instruction reordering
In practice, T2 is not going to see a null value after having seen a non-null value, but the compiler, the JVM, or the JIT can reorder instructions in a way that produces a similar execution. For example, a possible reordering (with a theoretical execution) would be:
public class UnsafeLazyInitialization {
    private static Resource resource;

    public static Resource getInstance() {
        Resource temp = resource;             //null in T1 and T2
        if (resource == null)                 //null in T1 but not in T2 because it has been initialised by T1 in the meantime
            resource = temp = new Resource(); //only executed by T1
        return temp;                          //T1 returns the new value, T2 returns null
    }
}
This reordering, although it makes little sense, is perfectly valid because it does not affect the intra-thread semantics (if run in a single-threaded environment, it will produce the same result as the original code).
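Both claims can be checked without real threads. In the sketch below, the reordered method takes a hypothetical Runnable hook that runs between its two reads and stands in for whatever another thread might do at that point. The hook and the class name are assumptions added for the demonstration, not part of the original code.

```java
final class ReorderedGetInstanceSketch {
    static final class Resource { }

    static Resource resource;

    // The reordered method, plus a hypothetical hook that runs between the
    // two reads to imitate another thread's activity.
    static Resource getInstance(Runnable betweenReads) {
        Resource temp = resource;             // first read
        betweenReads.run();                   // "other thread" runs here
        if (resource == null)                 // second read
            resource = temp = new Resource();
        return temp;
    }

    public static void main(String[] args) {
        // Nothing happens between the reads: same result as the original code.
        System.out.println(getInstance(() -> { }) != null);

        // Another thread initialises resource between the reads: null comes back.
        resource = null;
        System.out.println(getInstance(() -> resource = new Resource()) == null);
    }
}
```

With an empty hook the method behaves like the original, which is why the reordering is legal. When the hook initialises resource, the second read sees a non-null value, the assignment is skipped, and the stale null held in temp is returned.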
Conclusion
This example shows that even in a fairly contrived example the outcome of an improperly synchronized program can be quite surprising. Although it is unlikely that any compiler would actually perform that reordering, a more complex situation could quickly become impossible to analyse.
Bottom line: saving on synchronization is not an option.
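For contrast, here is a minimal sketch of the same lazy initialisation made correctly synchronized by declaring the method synchronized. It is one straightforward option rather than a recommendation over the alternatives, and the class name and the small harness around getInstance are assumptions added for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class SafeLazyInitializationSketch {
    static final class Resource { }

    private static Resource resource;

    // synchronized gives mutual exclusion, and each unlock happens-before
    // the next lock of the same monitor, so later callers see the write.
    static synchronized Resource getInstance() {
        if (resource == null)
            resource = new Resource();
        return resource;
    }

    // Calls getInstance() from several threads and returns the number of
    // distinct instances they observed.
    static int distinctInstancesSeen(int threadCount) {
        Set<Resource> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> seen.add(getInstance()));
            threads[i].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct instances: " + distinctInstancesSeen(8));
    }
}
```

Because every read and write of resource now happens while holding the same lock, the program has no data race, and the sequential-consistency guarantee quoted at the top applies: every caller gets the same, fully constructed instance.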