Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
469 views
in Technique[技术] by (71.8m points)

python - Using a dictionary in Cython , especially inside nogil

I am having a dictionary,

my_dict = {'a':[1,2,3], 'b':[4,5] , 'c':[7,1,2])

I want to use this dictionary inside a Cython nogil function . So , I tried to declare it as

cdef dict cy_dict = my_dict 

Up to this stage is fine.

Now I need to iterate over the keys of my_dict and if the values are in list, iterate over it. In Python , it is quite easy as follows:

 for key in my_dict:
      if isinstance(my_dict[key], (list, tuple)):
          ###### Iterate over the value of the list or tuple
          for value in list:
               ## Do some over operation.

But, inside Cython, I want to implement the same that too inside nogil . As, python objects are not allowed inside nogil, I am all stuck up here.

with nogil:
    #### same implementation of the same in Cython

Can anyone please help me out ?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can't use Python dict without the GIL because everything you could do with it involves manipulating Python objects. You most sensible option is to accept that you need the GIL. There's a less sensible option too involving C++ maps, but it may be hard to apply for your specific case.

You can use with gil: to reacquire the GIL. There is obvious an overhead here (parts using the GIL can't be executed in parallel, and there may be a delay which it waits for the GIL). However, if the dictionary manipulation is a small chunk of a larger piece of Cython code this may not be too bad:

with nogil:
  # some large chunk of computationally intensive code goes here
  with gil:
    # your dictionary code
  # more computationally intensive stuff here

The other less sensible option is to use C++ maps (along side other C++ standard library data types). Cython can wrap these and automatically convert them. To give a trivial example based on your example data:

from libcpp.map cimport map
from libcpp.string cimport string
from libcpp.vector cimport vector
from cython.operator cimport dereference, preincrement

def f():
    my_dict = {'a':[1,2,3], 'b':[4,5] , 'c':[7,1,2]}
    # the following conversion has an computational cost to it 
    # and must be done with the GIL. Depending on your design
    # you might be able to ensure it's only done once so that the
    # cost doesn't matter much
    cdef map[string,vector[int]] m = my_dict
    
    # cdef statements can't go inside no gil, but much of the work can
    cdef map[string,vector[int]].iterator end = m.end()
    cdef map[string,vector[int]].iterator it = m.begin()
        
    cdef int total_length = 0
    
    with nogil: # all  this stuff can now go inside nogil   
        while it != end:
            total_length += dereference(it).second.size()
            preincrement(it)
        
    print total_length

(you need to compile this with language='c++').

The obvious disadvantage to this is that the data-types inside the dict must be known in advance (it can't be an arbitrary Python object). However, since you can't manipulate arbitrary Python objects inside a nogil block you're pretty restricted anyway.

6-year later addendum: I don't recommend the "use C++ objects everywhere" approach as a general approach. The Cython-C++ interface is a bit clunky and you can spend a lot of time working around it. The Python containers are actually better than you think. Everyone tends to forget about the cost of converting their C++ objects to/from Python objects. People rarely consider if they really need to release the GIL or if they just read an article on the internet somewhere saying that the GIL is bad..

It's good for some tasks, but think carefully before blindly replacing all your list with vector, dict with map etc.. As a rule, if your C++ types live entirely within your function it may be a good move (but think twice...). If they're being converted as input or output arguments then think a third time.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...