Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

How to calculate HashMap memory usage in Java?

I was asked in an interview to calculate the memory usage of a HashMap and to estimate how much memory it would consume with 2 million items in it.

For example:

Map<String, List<String>> mp = new HashMap<String, List<String>>();

The mapping is like this.

key   value
----- ---------------------------
abc   ['hello','how']
abz   ['hello','how','are','you']

How would I estimate the memory usage of this HashMap Object in Java?

1 Reply

The short answer

To find out how large an object is, I would use a profiler. In YourKit, for example, you can search for the object and then have it calculate the object's deep size. This gives you a fair idea of how much memory would be used if the object were standalone, and it is a conservative size for the object.

The quibbles

If parts of the object are reused in other structures, e.g. String literals, you won't free this much memory by discarding it. In fact, discarding one reference to the HashMap might not free any memory at all.

What about Serialisation?

Serialising the object is one approach to getting an estimate, but it can be wildly off because the overhead and encoding differ between memory and a byte stream. How much memory is used depends on the JVM (and whether it's using 32-bit or 64-bit references), but the serialisation format is always the same.

e.g.

In Sun/Oracle's JVM, an Integer can take 16 bytes for the header, 4 bytes for the number and 4 bytes of padding (objects are 8-byte aligned in memory), for a total of 24 bytes. However, if you serialise one Integer it takes 81 bytes, and if you serialise two Integers they take 91 bytes. That is, the serialised size of the first Integer is inflated, while the second Integer adds only 10 bytes, less than it uses in memory.
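The serialised sizes quoted above can be checked with a few lines of code. This is a minimal sketch, not part of the original answer: the class name SerializedSizeDemo and the helper serializedSize are made up for illustration, and the byte counts it prints depend on the standard Java serialisation format used by ObjectOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
class SerializedSizeDemo {
    // Writes all given objects into one ObjectOutputStream and returns
    // the total number of bytes produced, including the stream header.
    static int serializedSize(Object... objects) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        // The first Integer carries the stream header and class descriptors.
        System.out.println("one Integer:  "
                + serializedSize(Integer.valueOf(1)));
        // A second, different Integer reuses the class descriptor by reference.
        System.out.println("two Integers: "
                + serializedSize(Integer.valueOf(1), Integer.valueOf(2)));
    }
}
```

The difference between the two printed numbers is the marginal cost of the second Integer in the stream, which is unrelated to its 16 to 24 bytes on the heap.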

String is a much more complex example. In the Sun/Oracle JVM, it contains 3 int values and a char[] reference. So you might assume it uses a 16-byte header plus 3 * 4 bytes for the ints, 4 bytes for the char[] reference, 16 bytes for the overhead of the char[] and then two bytes per char, aligned to an 8-byte boundary...

What flags can change the size?

If you have 64-bit references, the char[] reference is 8 bytes long, resulting in 4 bytes of padding. If you have a 64-bit JVM, you can use -XX:+UseCompressedOops to use 32-bit references. (So looking at the JVM bit size alone doesn't tell you the size of its references.)
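If you want to see which reference size a running JVM is actually using, one option on HotSpot-based JVMs is to read the flag through the HotSpotDiagnosticMXBean. The sketch below is an assumption-laden illustration: the class name CompressedOopsCheck is made up, the com.sun.management API is HotSpot-specific, and the method falls back to "unknown" wherever the option cannot be read (for example on a 32-bit or non-HotSpot JVM).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, for illustration only.
class CompressedOopsCheck {
    // Returns "true" or "false" as reported by the JVM,
    // or "unknown" when the option is not available.
    static String useCompressedOops() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unknown";
            }
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException | LinkageError e) {
            // e.g. IllegalArgumentException when the JVM has no such option
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // sun.arch.data.model is a non-standard property and may be null.
        System.out.println("JVM bits:           " + System.getProperty("sun.arch.data.model"));
        System.out.println("UseCompressedOops:  " + useCompressedOops());
    }
}
```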

If you have -XX:+UseCompressedStrings, the JVM will use a byte[] instead of a char[] when it can. This can slow down your application slightly but could improve your memory consumption dramatically. When a byte[] is used, the memory consumed is 1 byte per char. ;) Note: for a 4-char String, as in the example, the size used is the same due to the 8-byte boundary.
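The arithmetic for the String example can be written down as a small estimator. This is only a sketch of the back-of-envelope reasoning above, using its layout (3 ints plus one array reference, a separate backing array, 8-byte alignment); the class and method names are made up, and the header size, reference size and bytes per char are parameters you have to assume for your own JVM.

```java
// Hypothetical estimator, for illustration only.
class StringSizeEstimate {
    // Rounds a size up to the next multiple of 8, since objects are 8-byte aligned.
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Estimated bytes for one String that owns its backing array.
    //   headerBytes:    assumed object/array header size (16 in the example above)
    //   referenceBytes: 4 for 32-bit or compressed references, 8 for 64-bit references
    //   bytesPerChar:   2 for a char[], 1 when a byte[] can be used
    static long estimate(int length, int headerBytes, int referenceBytes, int bytesPerChar) {
        long stringObject = align8(headerBytes + 3 * 4 + referenceBytes);
        long backingArray = align8(headerBytes + (long) bytesPerChar * length);
        return stringObject + backingArray;
    }

    public static void main(String[] args) {
        System.out.println("\"abc\",  32-bit refs: " + estimate(3, 16, 4, 2));
        System.out.println("\"abc\",  64-bit refs: " + estimate(3, 16, 8, 2));
        System.out.println("4 chars, char[]:     " + estimate(4, 16, 4, 2));
        System.out.println("4 chars, byte[]:     " + estimate(4, 16, 4, 1));
    }
}
```

With these assumed parameters the last two lines print the same number, which is the 8-byte boundary effect described in the note: the saving from a byte[] only shows up for longer Strings.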

What do you mean by "size"?

As has been pointed out, HashMap and List are more complex, as many, if not all, of the Strings can be reused, possibly as String literals. What you mean by "size" depends on how it is used. For example: how much memory would the structure use alone? How much would be freed if the structure were discarded? How much memory would be used if you copied the structure? These questions can have different answers.

What can you do without a profiler?

If you can determine that the likely conservative size is small enough, the exact size doesn't matter. The conservative case is likely to be the one where you construct every String and entry from scratch. (I only say likely because a HashMap can have capacity for 1 billion entries even though it is empty, and a String with a single char can be a sub-string of a String with 2 billion characters.)

You can perform a System.gc(), take the free memory, create the objects, perform another System.gc() and see how much the free memory has reduced. You may need to create the object many times and take an average. The result is not exact, and you should repeat the exercise several times, but it can give you a fair idea.
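The procedure above might look like the following sketch for the map from the question. Everything here is illustrative: the class name HeapDeltaEstimate, the key pattern "key" + i and the entry count are assumptions, and every String is copied so that the conservative "built from scratch" case is measured rather than shared literals. The numbers will vary between runs and JVMs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical measurement harness, for illustration only.
class HeapDeltaEstimate {
    // Requests a GC (only a hint) and returns the heap currently in use.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Builds a map shaped like the one in the question, with no shared Strings.
    static Map<String, List<String>> build(int entries) {
        String[] words = {"hello", "how", "are", "you"};
        Map<String, List<String>> map = new HashMap<>();
        for (int i = 0; i < entries; i++) {
            List<String> value = new ArrayList<>();
            for (String w : words) {
                // Copy the characters so each entry owns its own Strings.
                value.add(new String(w.toCharArray()));
            }
            map.put("key" + i, value);
        }
        return map;
    }

    public static void main(String[] args) {
        int entries = 200_000; // scale the per-entry result up to 2 million
        long before = usedHeap();
        Map<String, List<String>> map = build(entries);
        long after = usedHeap();
        System.out.println("approx. bytes per entry: " + (after - before) / entries);
        // Use the map after measuring so it stays reachable until here.
        System.out.println("entries held: " + map.size());
    }
}
```

Multiplying the per-entry figure by 2 million gives a rough estimate for the interview question; running it with and without compressed references shows how much the answer depends on JVM settings.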

(BTW, while System.gc() is only a hint, the Sun/Oracle JVM will perform a full GC every time by default.)

