Compressed strings (Java 6) and compact strings (Java 9) both have the same motivation (strings are often effectively Latin-1, so half the space is wasted) and goal (make those strings small) but the implementations differ a lot.
Compressed Strings
In an interview, Aleksey Shipilëv (who was in charge of implementing the Java 9 feature) had this to say about compressed strings:
UseCompressedStrings feature was rather conservative: while distinguishing between char[] and byte[] case, and trying to compress the char[] into byte[] on String construction, it done most String operations on char[], which required to unpack the String. Therefore, it benefited only a special type of workloads, where most strings are compressible (so compression does not go to waste), and only a limited amount of known String operations are performed on them (so no unpacking is needed). In great many workloads, enabling -XX:+UseCompressedStrings was a pessimization.
[...] UseCompressedStrings implementation was basically an optional feature that maintained a completely distinct String implementation in alt-rt.jar, which was loaded once the VM option is supplied. Optional features are harder to test, since they double the number of option combinations to try.
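To make the "compress on construction, unpack on use" pattern from the quote more concrete, here is a minimal sketch. It is not the code that shipped in alt-rt.jar: the class CompressedText, its methods, and the one-byte cutoff are all hypothetical and chosen only for illustration.

```java
/** Hypothetical illustration only; not the Java 6 implementation. */
final class CompressedText {
    // Assumed cutoff for this sketch; the pattern matters, not the exact range.
    private static final int MAX_COMPRESSIBLE = 0x7F;

    // Exactly one of these two fields is non-null.
    private final byte[] bytes; // used when every char passed the cutoff
    private final char[] chars; // fallback for everything else

    CompressedText(char[] source) {
        byte[] packed = tryCompress(source);
        this.bytes = packed;
        this.chars = (packed == null) ? source.clone() : null;
    }

    /** Returns a packed copy, or null if any char exceeds the cutoff. */
    private static byte[] tryCompress(char[] source) {
        byte[] packed = new byte[source.length];
        for (int i = 0; i < source.length; i++) {
            if (source[i] > MAX_COMPRESSIBLE) {
                return null; // the work done so far is wasted
            }
            packed[i] = (byte) source[i];
        }
        return packed;
    }

    boolean isCompressed() {
        return bytes != null;
    }

    /** Inflates to char[]; allocates a new array whenever the text is compressed. */
    private char[] unpack() {
        if (chars != null) {
            return chars;
        }
        char[] inflated = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            inflated[i] = (char) (bytes[i] & 0xFF);
        }
        return inflated;
    }

    /** An operation written only against char[], so it has to unpack first. */
    int indexOf(char target) {
        char[] data = unpack();
        for (int i = 0; i < data.length; i++) {
            if (data[i] == target) {
                return i;
            }
        }
        return -1;
    }
}
```

In this sketch every call to indexOf on a compressed instance allocates and fills a fresh char[], which is the kind of cost that can turn the space saving into a pessimization.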
Compact Strings
In Java 9, on the other hand, compact strings are fully integrated into the JDK source. String is always backed by a byte[], which stores one byte per character if every character in the string is Latin-1 and two bytes per character otherwise. Most operations check which of the two encodings is in use, e.g. charAt:
    public char charAt(int index) {
        if (isLatin1()) {
            return StringLatin1.charAt(value, index);
        } else {
            return StringUTF16.charAt(value, index);
        }
    }
Compact strings are enabled by default and can be partially disabled with -XX:-CompactStrings - "partially" because strings are still backed by a byte[] and operations returning chars must still assemble each one from two separate bytes (because of intrinsics it is hard to say whether this has a performance impact).
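The following is a minimal sketch of that layout, assuming a one-byte coder flag stored next to the byte[] and big-endian byte order for the two-byte case. CompactText, LATIN1, UTF16, and byteSize are illustrative names, and the compact constructor argument merely stands in for the VM switch; the real JDK classes are more involved and rely on intrinsics.

```java
/** Hypothetical illustration only; not the JDK 9 String source. */
final class CompactText {
    static final byte LATIN1 = 0; // one byte per char
    static final byte UTF16 = 1;  // two bytes per char

    private final byte[] value;
    private final byte coder;

    /** compact == false mimics compaction being switched off: always two bytes per char. */
    CompactText(String s, boolean compact) {
        if (compact && fitsLatin1(s)) {
            value = new byte[s.length()];
            for (int i = 0; i < s.length(); i++) {
                value[i] = (byte) s.charAt(i);
            }
            coder = LATIN1;
        } else {
            value = new byte[s.length() * 2];
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                value[2 * i] = (byte) (c >> 8); // high byte (assumed big-endian)
                value[2 * i + 1] = (byte) c;    // low byte
            }
            coder = UTF16;
        }
    }

    private static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    /** Number of chars: the byte count shifted right by the coder (0 or 1). */
    int length() {
        return value.length >> coder;
    }

    /** Number of bytes actually used for storage. */
    int byteSize() {
        return value.length;
    }

    char charAt(int index) {
        if (index < 0 || index >= length()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (coder == LATIN1) {
            return (char) (value[index] & 0xFF);
        }
        // Two-byte case: the char has to be put together from two separate bytes.
        int hi = value[2 * index] & 0xFF;
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }
}
```

In this sketch, new CompactText("café", true) stores 4 bytes, while new CompactText("café", false) stores 8 bytes and every charAt call goes through the two-byte branch.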
More
If you're interested in more background on compact strings, I recommend reading the interview I linked to above and/or watching this great talk by the same Aleksey Shipilëv (which also explains the new string concatenation).