The line from the documentation,
If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
indeed dates from before the tree-bin implementation was added in JDK 8 (JEP 180). You can see this text in the JDK 1.6 HashMap documentation. In fact, this text dates all the way back to JDK 1.2 when the Collections Framework (including HashMap) was introduced. You can find unofficial versions of the JDK 1.2 docs around the web, or you can download a version from the archives if you want to see for yourself.
I believe this documentation was correct up until the tree-bin implementation was added. However, as you've observed, there are now cases where it's incorrect. The policy is not only that resizing can occur if the number of entries divided by the load factor exceeds the capacity (really, table length). As you noted, resizes can also occur if the number of entries in a single bucket exceeds TREEIFY_THRESHOLD (currently 8) but the table length is smaller than MIN_TREEIFY_CAPACITY (currently 64).
You can see this decision in the treeifyBin()
method of HashMap.
if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
resize();
else if ((e = tab[index = (n - 1) & hash]) != null) {
This point in the code is reached when there are more than TREEIFY_THRESHOLD entries in a single bucket. If the table size is at or above MIN_TREEIFY_CAPACITY, this bin is treeified; otherwise, the table is simply resized.
Note that this can leave bins with rather more entries than TREEIFY_THRESHOLD at small table sizes. This isn't terribly difficult to demonstrate. First, some reflective HashMap-dumping code:
// run with --add-opens java.base/java.util=ALL-UNNAMED
static Class<?> classNode;
static Class<?> classTreeNode;
static Field fieldNodeNext;
static Field fieldHashMapTable;
static void init() throws ReflectiveOperationException {
classNode = Class.forName("java.util.HashMap$Node");
classTreeNode = Class.forName("java.util.HashMap$TreeNode");
fieldNodeNext = classNode.getDeclaredField("next");
fieldNodeNext.setAccessible(true);
fieldHashMapTable = HashMap.class.getDeclaredField("table");
fieldHashMapTable.setAccessible(true);
}
static void dumpMap(HashMap<?, ?> map) throws ReflectiveOperationException {
Object[] table = (Object[])fieldHashMapTable.get(map);
System.out.printf("map size = %d, table length = %d%n", map.size(), table.length);
for (int i = 0; i < table.length; i++) {
Object node = table[i];
if (node == null)
continue;
System.out.printf("table[%d] = %s", i,
classTreeNode.isInstance(node) ? "TreeNode" : "BasicNode");
for (; node != null; node = fieldNodeNext.get(node))
System.out.print(" " + node);
System.out.println();
}
}
Now, let's add a bunch of strings that all fall into the same bucket. These strings are chosen such that their hash values, as computed by HashMap, are all 0 mod 64.
public static void main(String[] args) throws ReflectiveOperationException {
init();
List<String> list = List.of(
"LBCDD", "IKBNU", "WZQAG", "MKEAZ", "BBCHF", "KRQHE", "ZZMWH", "FHLVH",
"ZFLXM", "TXXPE", "NSJDQ", "BXDMJ", "OFBCR", "WVSIG", "HQDXY");
HashMap<String, String> map = new HashMap<>(1, 10.0f);
for (String s : list) {
System.out.println("===> put " + s);
map.put(s, s);
dumpMap(map);
}
}
Starting from an initial table size of 1 and a ridiculous load factor, this puts 8 entries into the lone bucket. Then, each time another entry is added, the table is resized (doubled) but all the entries end up in the same bucket. This eventually results in a table of size 64 with one bucket having a linear chain of nodes ("basic nodes") of length 14, before adding the next entry finally converts this to a tree.
Output of the program is as follows:
===> put LBCDD
map size = 1, table length = 1
table[0] = BasicNode LBCDD=LBCDD
===> put IKBNU
map size = 2, table length = 1
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU
===> put WZQAG
map size = 3, table length = 1
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG
===> put MKEAZ
map size = 4, table length = 1
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ
===> put BBCHF
map size = 5, table length = 1
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF
===> put KRQHE
map size = 6, table length = 1
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE
===> put ZZMWH
map size = 7, table length = 1
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE ZZMWH=ZZMWH
===> put FHLVH
map size = 8, table length = 1
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE ZZMWH=ZZMWH FHLVH=FHLVH
===> put ZFLXM
map size = 9, table length = 2
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE ZZMWH=ZZMWH FHLVH=FHLVH ZFLXM=ZFLXM
===> put TXXPE
map size = 10, table length = 4
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE ZZMWH=ZZMWH FHLVH=FHLVH ZFLXM=ZFLXM TXXPE=TXXPE
===> put NSJDQ
map size = 11, table length = 8
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE ZZMWH=ZZMWH FHLVH=FHLVH ZFLXM=ZFLXM TXXPE=TXXPE NSJDQ=NSJDQ
===> put BXDMJ
map size = 12, table length = 16
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE ZZMWH=ZZMWH FHLVH=FHLVH ZFLXM=ZFLXM TXXPE=TXXPE NSJDQ=NSJDQ BXDMJ=BXDMJ
===> put OFBCR
map size = 13, table length = 32
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE ZZMWH=ZZMWH FHLVH=FHLVH ZFLXM=ZFLXM TXXPE=TXXPE NSJDQ=NSJDQ BXDMJ=BXDMJ OFBCR=OFBCR
===> put WVSIG
map size = 14, table length = 64
table[0] = BasicNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE ZZMWH=ZZMWH FHLVH=FHLVH ZFLXM=ZFLXM TXXPE=TXXPE NSJDQ=NSJDQ BXDMJ=BXDMJ OFBCR=OFBCR WVSIG=WVSIG
===> put HQDXY
map size = 15, table length = 64
table[0] = TreeNode LBCDD=LBCDD IKBNU=IKBNU WZQAG=WZQAG MKEAZ=MKEAZ BBCHF=BBCHF KRQHE=KRQHE ZZMWH=ZZMWH FHLVH=FHLVH ZFLXM=ZFLXM TXXPE=TXXPE NSJDQ=NSJDQ BXDMJ=BXDMJ OFBCR=OFBCR WVSIG=WVSIG HQDXY=HQDXY