I was reading the javadocs on HashSet when I came across the interesting statement:
This class offers constant time performance for the basic operations (add, remove, contains and size)
This confuses me greatly, as I don't understand how one could possibly get constant time, O(1), performance for a comparison operation. Here are my thoughts:
If this is true, then no matter how much data I'm dumping into my HashSet, I will be able to access any element in constant time. That is, if I put 1 element in my HashSet, it will take the same amount of time to find it as if I had a googolplex of elements.
However, this wouldn't be possible if I had a constant number of buckets, or a consistent hash function, since for any fixed number of buckets, the number of elements in that bucket will grow linearly (albeit slowly, if the number is big enough) with the number of elements in the set.
Then, the only way for this to work is to have a changing hash function every time you insert an element (or every few times). A simple hash function that never any collisions would satisfy this need. One toy example for strings could be: Take the ASCII value of the strings and concatenate them together (because adding could result in a conflict).
However, this hash function, and any other hash function of this sort will likely fail for large enough strings or numbers etc. The number of buckets that you can form is immediately limited by the amount of stack/heap space you have, etc. Thus, skipping locations in memory can't be allowed indefinitely, so you'll eventually have to fill in the gaps.
But if at some point there's a recalculation of the hash function, this can only be as fast as finding a polynomial which passes through N points, or O(nlogn).
Thus arrives my confusion. While I will believe that the HashSet can access elements in O(n/B) time, where B is the number of buckets it has decided to use, I don't see how a HashSet could possibly perform add or get functions in O(1) time.
Note: This post and this post both don't address the concerns I listed..
See Question&Answers more detail:
os