I have a large Python code base that we recently started compiling with Cython. Without any changes to the code, I expected performance to stay about the same, and we planned to optimize the heavier computations with Cython-specific code after profiling. However, the speed of the compiled application plummeted, and the slowdown appears to be across the board: methods are taking anywhere from 10% to 300% longer than before.
I've been playing around with test code to find things Cython does poorly, and string manipulation appears to be one of them. My question is: am I doing something wrong, or is Cython really just bad at some things? Can you help me understand why this case is so slow and what else Cython might handle very poorly?
EDIT: Let me try to clarify. I realize that this type of string concatenation is very bad; I just noticed it has a huge speed difference, so I posted it (probably a bad idea). The codebase doesn't have this type of terrible code but has still slowed dramatically, and I'm hoping for pointers on what types of constructs Cython handles poorly so I can figure out where to look. I've tried profiling, but it was not particularly helpful.
For reference, here is my string manipulation test code. I realize the code below is terrible and useless, but I'm still shocked by the speed difference.
# pyCode.py

def str1():
    val = ""
    for i in xrange(100000):
        val = str(i)

def str2():
    val = ""
    for i in xrange(100000):
        val += 'a'

def str3():
    val = ""
    for i in xrange(100000):
        val += str(i)
Timing code:
# compare.py
import timeit
pyTimes = {}
cyTimes = {}
# STR1
number=10
setup = "import pyCode"
stmt = "pyCode.str1()"
pyTimes['str1'] = timeit.timeit(stmt=stmt, setup=setup, number=number)
setup = "import cyCode"
stmt = "cyCode.str1()"
cyTimes['str1'] = timeit.timeit(stmt=stmt, setup=setup, number=number)
# STR2
setup = "import pyCode"
stmt = "pyCode.str2()"
pyTimes['str2'] = timeit.timeit(stmt=stmt, setup=setup, number=number)
setup = "import cyCode"
stmt = "cyCode.str2()"
cyTimes['str2'] = timeit.timeit(stmt=stmt, setup=setup, number=number)
# STR3
setup = "import pyCode"
stmt = "pyCode.str3()"
pyTimes['str3'] = timeit.timeit(stmt=stmt, setup=setup, number=number)
setup = "import cyCode"
stmt = "cyCode.str3()"
cyTimes['str3'] = timeit.timeit(stmt=stmt, setup=setup, number=number)
for funcName in sorted(pyTimes.viewkeys()):
    print "PY {} took {}s".format(funcName, pyTimes[funcName])
    print "CY {} took {}s".format(funcName, cyTimes[funcName])
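The three near-identical timing blocks above could be collapsed into a loop. The following is only a sketch of that idea, not the script I actually ran: `time_functions` is a hypothetical helper name, and it assumes every module listed is importable and exposes each function as a zero-argument callable.

```python
import timeit


def time_functions(module_names, func_names, number=10):
    """Return {module_name: {func_name: seconds}} measured with timeit.

    Hypothetical helper: builds the same stmt/setup strings as the
    hand-written blocks, e.g. stmt="pyCode.str1()", setup="import pyCode".
    """
    results = {}
    for module_name in module_names:
        results[module_name] = {}
        for func_name in func_names:
            results[module_name][func_name] = timeit.timeit(
                stmt="{}.{}()".format(module_name, func_name),
                setup="import {}".format(module_name),
                number=number,
            )
    return results


# Intended usage with the modules from this question (not executed here,
# because it needs pyCode.py and the compiled cyCode.so on the path):
#
#   times = time_functions(["pyCode", "cyCode"], ["str1", "str2", "str3"])
#   for func_name in ["str1", "str2", "str3"]:
#       print("PY {} took {}s".format(func_name, times["pyCode"][func_name]))
#       print("CY {} took {}s".format(func_name, times["cyCode"][func_name]))
```

Keeping `number` as a single argument also makes it harder to accidentally time the two modules with different repeat counts.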
Compiling the Cython module with:
cp pyCode.py cyCode.py
cython cyCode.py
gcc -O2 -fPIC -shared -I$PYTHONHOME/include/python2.7 \
    -fno-strict-aliasing -fno-strict-overflow -o cyCode.so cyCode.c
Resulting timings:
> python compare.py
PY str1 took 0.1610019207s
CY str1 took 0.104282140732s
PY str2 took 0.0739600658417s
CY str2 took 2.34380102158s
PY str3 took 0.224936962128s
CY str3 took 21.6859738827s
For reference, I've tried this with Cython 0.19.1 and 0.23.4. I've compiled the C code with gcc 4.8.2 and icc 14.0.2, trying various flags with both.
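To put the timings above in relative terms, here is a small sketch that divides each Cython time by the matching pure-Python time. The numbers are copied from the output above; `slowdown` is just an illustrative helper name. By this measure str1 is somewhat faster under Cython (a ratio of roughly 0.65), while str2 and str3 are roughly 32 and 96 times slower.

```python
# Timings copied from the output above (seconds for number=10 calls).
py_times = {"str1": 0.1610019207, "str2": 0.0739600658417, "str3": 0.224936962128}
cy_times = {"str1": 0.104282140732, "str2": 2.34380102158, "str3": 21.6859738827}


def slowdown(py_times, cy_times):
    """Return {name: cython_seconds / python_seconds} for each function."""
    return {name: cy_times[name] / py_times[name] for name in py_times}


ratios = slowdown(py_times, cy_times)
for name in sorted(ratios):
    # A ratio above 1.0 means the Cython build was slower than plain Python.
    print("{}: Cython/Python = {:.2f}x".format(name, ratios[name]))
```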