Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


hash - Time complexity of accessing a Python dict

I am writing a simple Python program.

My program seems to suffer from linear-time access to dictionaries: its run time grows exponentially even though the algorithm is quadratic.
I use a dictionary to memoize values. That seems to be a bottleneck.

The values I'm hashing are tuples of points. Each point is: (x,y), 0 <= x,y <= 50
Each key in the dictionary is: A tuple of 2-5 points: ((x1,y1),(x2,y2),(x3,y3),(x4,y4))

The keys are read many times more often than they are written.

Am I correct that Python dicts suffer from linear access times with such inputs?

As far as I know, sets have guaranteed logarithmic access times.
How can I simulate dicts using sets (or something similar) in Python?

Edit: As requested, here's a (simplified) version of the memoization function:

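def memoize(fun):
    memoized = {}  # cache: argument tuple -> computed result
    def memo(*args):
        key = args  # args is a tuple; it is hashable as long as every argument is
        if key not in memoized:
            memoized[key] = fun(*args)
        return memoized[key]
    return memo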


1 Reply


See Time Complexity. The Python dict is a hash map, so its worst case is O(n): that happens when the hash function is bad and produces a lot of collisions. It is a very rare case, though, because it requires every item added to have the same hash and therefore to collide with every item already stored, which is extremely unlikely with a major Python implementation. The average time complexity is of course O(1).
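To make the worst case concrete, here is a minimal sketch that forces it on purpose. `CollidingKey` and `comparisons_per_lookup` are hypothetical names invented for this illustration; the class returns the same hash for every instance, and the helper counts how many `__eq__` calls a single lookup needs. It assumes Python 3.

```python
class CollidingKey(object):
    """Hypothetical key type: every instance reports the same hash."""

    eq_calls = 0  # class-wide counter of __eq__ invocations

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # deliberately bad: all keys collide

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return self.value == other.value


def comparisons_per_lookup(n):
    """Build a dict of n colliding keys and count the __eq__ calls for one lookup."""
    table = {CollidingKey(i): i for i in range(n)}
    probe = CollidingKey(n - 1)  # equal to the last key inserted, but a new object
    CollidingKey.eq_calls = 0
    assert table[probe] == n - 1
    return CollidingKey.eq_calls


for n in (10, 100, 1000):
    print(n, comparisons_per_lookup(n))
```

With a hash this bad, the number of comparisons per lookup should grow with the size of the dict, which is the O(n) behaviour described above. Ordinary tuples of small integers do not behave like this.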

The best approach is to look at the hashes of the objects you are using. The CPython dict calls PyObject_Hash(PyObject *o), which is the equivalent of hash(o).

After a quick check, I have not managed to find two point tuples (x, y) that hash to the same value, which suggests that the lookup is O(1):

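seen = set()  # hashes seen so far; a set keeps the membership test O(1)
for x in range(51):  # the question allows 0 <= x, y <= 50
    for y in range(51):
        h = hash((x, y))
        if h in seen:
            print("Fail: {}".format((x, y)))
        seen.add(h)
print("Test Finished")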
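The check above only covers single points, while the keys in the question are tuples of 2 to 5 points. Enumerating every such key is impractical, so the sketch below samples random keys of that shape and counts how many distinct keys share a hash with another. `random_key` and `count_collisions` are hypothetical helpers written for this illustration, and the sample size and seed are arbitrary assumptions.

```python
import random


def random_key(rng):
    """Return a tuple of 2 to 5 points, each point (x, y) with 0 <= x, y <= 50."""
    n_points = rng.randint(2, 5)
    return tuple((rng.randint(0, 50), rng.randint(0, 50)) for _ in range(n_points))


def count_collisions(sample_size=100000, seed=0):
    """Sample keys and return (distinct keys, distinct keys minus distinct hashes)."""
    rng = random.Random(seed)
    keys = {random_key(rng) for _ in range(sample_size)}  # drop duplicate keys
    hashes = {hash(key) for key in keys}
    return len(keys), len(keys) - len(hashes)


n_keys, collisions = count_collisions()
print("distinct keys:", n_keys, "hash collisions:", collisions)
```

A collision count that is zero or tiny relative to the number of keys would support the conclusion that lookups on these keys are O(1) on average; a large count would point to the hash as the problem.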


