Why must it work?
JLS 8 § 15.18.1 "String Concatenation Operator +", leading to JLS 8 § 5.1.11 "String Conversion", requires this operation to succeed without failure:
...Now only reference values need to be considered. If the reference is null, it is converted to the string "null" (four ASCII characters n, u, l, l). Otherwise, the conversion is performed as if by an invocation of the toString method of the referenced object with no arguments; but if the result of invoking the toString method is null, then the string "null" is used instead.
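Both branches of that rule can be checked directly. The following is a minimal sketch; the class names `StringConversionDemo` and `NullToString` are made up for illustration and do not come from the JLS or the JDK.

```java
class StringConversionDemo {
    // Hypothetical class whose toString() misbehaves by returning null.
    static final class NullToString {
        @Override
        public String toString() {
            return null;
        }
    }

    // Case 1: the reference itself is null.
    static String concatNullReference() {
        Object ref = null;
        return "value: " + ref;
    }

    // Case 2: the reference is non-null, but toString() returns null.
    static String concatNullToString() {
        Object ref = new NullToString();
        return "value: " + ref;
    }

    public static void main(String[] args) {
        System.out.println(concatNullReference()); // prints "value: null"
        System.out.println(concatNullToString());  // prints "value: null"
    }
}
```

In neither case is a NullPointerException thrown; string conversion substitutes the four characters "null".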
How does it work?
Let's look at the bytecode! The compiler takes your code:
String s = null;
s = s + "hello";
System.out.println(s); // prints "nullhello"
and compiles it into bytecode as if you had instead written this:
String s = null;
s = new StringBuilder(String.valueOf(s)).append("hello").toString();
System.out.println(s); // prints "nullhello"
(You can inspect the bytecode yourself by using javap -c.)
The append methods of StringBuilder all handle null just fine. In this case, because null is the first argument, String.valueOf() is invoked instead, since StringBuilder does not have a constructor that takes an arbitrary reference type.
If you had written s = "hello" + s instead, the equivalent code would be:
s = new StringBuilder("hello").append(s).toString();
where, in this case, the append method receives the null and appends the string "null" itself (the append(Object) overload does so by delegating to String.valueOf()).
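The two expansions can be run by hand to confirm that they behave the same as the + operator. This is only a sketch of the StringBuilder-based translation described here; the class and method names (`NullAppendDemo`, `nullFirst`, `nullLast`, `constructorRejectsNull`) are invented for the example. It also shows why String.valueOf() is useful in the first form: passing a null String straight to the StringBuilder(String) constructor throws a NullPointerException.

```java
class NullAppendDemo {
    // Mirrors the expansion of: s + "hello"
    static String nullFirst(String s) {
        return new StringBuilder(String.valueOf(s)).append("hello").toString();
    }

    // Mirrors the expansion of: "hello" + s
    static String nullLast(String s) {
        return new StringBuilder("hello").append(s).toString();
    }

    // Returns true if new StringBuilder((String) null) throws a NullPointerException.
    static boolean constructorRejectsNull() {
        try {
            String s = null;
            new StringBuilder(s);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nullFirst(null));          // prints "nullhello"
        System.out.println(nullLast(null));           // prints "hellonull"
        System.out.println(constructorRejectsNull()); // prints "true"
    }
}
```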
Note: String concatenation is actually one of the rare places where the compiler gets to decide which optimization(s) to perform. As such, the "exact equivalent" code may differ from compiler to compiler. This optimization is allowed by JLS, Section 15.18.1.2:
To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.
The compiler I used to determine the "equivalent code" above was Eclipse's compiler, ecj.