Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Improving Numpy Performance

I'd like to improve the performance of 2D convolution in Python, and was hoping for some insight on how best to go about it.

I am currently using scipy to perform the convolution, using code somewhat like the snippet below:

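import numpy
import scipy.signal
import timeit

# 1000x1000 test image; reshape returns a new array, so the result must be assigned
a = numpy.arange(1000000).reshape(1000, 1000)
# 3x3 kernel: all ones with -8 in the center
filt = numpy.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]])

def convolve():
    # a and filt are module-level globals; they are only read here
    return scipy.signal.convolve2d(a, filt, mode="same")

t = timeit.Timer("convolve()", "from __main__ import convolve")
# average seconds per call over 10 runs
print("%.2f sec/pass" % (t.timeit(number=10) / 10))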

I am processing image data, using grayscale (integer values between 0 and 255), and I currently get about a quarter of a second per convolution. My thinking was to do one of the following:

- Use corepy, preferably with some optimizations.
- Recompile numpy with icc & ikml.
- Use python-cuda.

I was wondering if anyone has experience with any of these approaches (what sort of gain would be typical, and whether it is worth the time), or if anyone is aware of a better library for performing convolution with NumPy.

Thanks!

EDIT:

I got a speedup of about 10x by rewriting the Python loop in C instead of using NumPy.


1 Reply


The code in scipy for doing 2d convolutions is a bit messy and unoptimized. See http://svn.scipy.org/svn/scipy/trunk/scipy/signal/firfilter.c if you want a glimpse into the low-level functioning of scipy.

If all you want is to process with a small, constant kernel like the one you showed, a function like this might work:

def specialconvolve(a):
    # sorry, you must pad the input yourself
    rowconvol = a[1:-1,:] + a[:-2,:] + a[2:,:]
    colconvol = rowconvol[:,1:-1] + rowconvol[:,:-2] + rowconvol[:,2:] - 9*a[1:-1,1:-1]
    return colconvol

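This function takes advantage of the separability of the all-ones part of the kernel, as DarenW suggested above: the 3x3 box sum is computed as a row pass followed by a column pass, and the center term is then corrected by subtracting 9 times the original pixel (1 - 9 = -8). It also benefits from numpy's more optimized arithmetic routines. Note that the output is two rows and two columns smaller than the input, which is why the input must be padded first. By my measurements, it's over 1000 times faster than the convolve2d function.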
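To make the padding step concrete, here is a minimal sketch, not part of the original answer. The wrapper name specialconvolve_same and the brute-force reference naive_convolve_same are hypothetical helpers introduced for illustration. It assumes zero padding is wanted at the borders, and it casts to a wider integer type first on the assumption that the image may arrive as uint8, where the sums would otherwise wrap around.

```python
import numpy as np

def specialconvolve(a):
    # Same slicing approach as the function above; output loses one pixel per edge.
    rowconvol = a[1:-1, :] + a[:-2, :] + a[2:, :]
    return (rowconvol[:, 1:-1] + rowconvol[:, :-2] + rowconvol[:, 2:]
            - 9 * a[1:-1, 1:-1])

def specialconvolve_same(a):
    # Hypothetical wrapper: widen the dtype (assumption: input may be uint8),
    # then zero-pad by one pixel so the output has the same shape as the input.
    a = np.asarray(a, dtype=np.int64)
    padded = np.pad(a, 1, mode="constant", constant_values=0)
    return specialconvolve(padded)

def naive_convolve_same(a, k):
    # Slow brute-force 2D convolution with zero padding, used only as a reference.
    kh, kw = k.shape
    p = np.pad(a, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="constant")
    kf = k[::-1, ::-1]  # convolution flips the kernel (a no-op for this symmetric one)
    out = np.zeros_like(a)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            out[i, j] = np.sum(p[i:i + kh, j:j + kw] * kf)
    return out

filt = np.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]])
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 24))  # small grayscale-like test image (int64)

# The fast version should agree with the brute-force reference everywhere.
print(np.array_equal(specialconvolve_same(img), naive_convolve_same(img, filt)))
```

With zero padding, the result should also agree with scipy.signal.convolve2d(img, filt, mode="same"), whose default boundary handling fills with zeros. A different border treatment (for example, replicating edge pixels) would only require changing the mode passed to np.pad.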

