Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
205 views
in Technique[技术] by (71.8m points)

python - Programing and optimizing code for a Basin Hopping method

So I'm programing a Basin Hoping from zero to find the potential minima of a system of interacting particles and I have obtained good results so far, but since since I increase the number of particles of the system, the code takes more time to run. I am using scipy conjugate gradient for finding local minima. I'm not getting any error messages and the program seems to work out fine, but I was wondering how I can optimize the code so the computing time is reduced.

I defined the Basin Hopping as a function, given:

def bhmet(n,itmax, pot,derpot, T):
i = 1 
phi = np.random.rand(n)*2*np.pi
theta = np.random.rand(n)*np.pi
x = 3*np.sin(theta)*np.cos(phi)
y = 3*np.sin(theta)*np.sin(phi)
z = 3*np.cos(theta)
xyzat = np.hstack((x,y,z))
vmintot = 0 
while i <= itmax: 
    print(i)   
    plmin = optimize.fmin_cg(pot,xyzat,derpot,gtol = 1e-5,) #posiciones para el mínimo local.4
    if pot(plmin) < vmintot or np.exp((1/T)*(vmintot-pot(plmin))) > np.random.rand():
        vmintot = pot(plmin)
    xyzat = plmin + 2*0.3*(np.random.rand(len(plmin))-0.5)
    i = i + 1  
return plmin,vmintot 

I tried defining the initial condition (the first 'xyzat') as a matrix, but the parameter of scipy.optimize.fmin_cg is requested as an array (and thats why in the functions I will be reshaping the array as a matrix).

The function I'm searching the global minima for is:

def ljpot(posiciones,):
r = np.array([])
matpos = np.zeros((int((1/3)*len(posiciones)),3))
matpos[:,0] = posiciones[0:int((1/3)*len(posiciones))]
matpos[:,1] = posiciones[int((1/3)*len(posiciones)):int((2/3)*len(posiciones))]
matpos[:,2] = posiciones[int((2/3)*len(posiciones)):]    
for j in range(0,np.shape(matpos)[0]):
    for k in range(j+1,np.shape(matpos)[0]):
        ri = np.sqrt(sum((matpos[k,:]-matpos[j,:])**2))
        r = np.append(r,ri) 
V = 4*((1/r)**12-(1/r)**6)
vt = sum(V)
return vt

And its gradient would be:

def gradpot(posiciones,):
gradv = np.array([])
matposg = np.zeros((int((1/3)*len(posiciones)),3))
matposg[:,0] = posiciones[:int((1/3)*len(posiciones))]
matposg[:,1] = posiciones[int((1/3)*len(posiciones)):int((2/3)*len(posiciones))]
matposg[:,2] = posiciones[int((2/3)*len(posiciones)):]   
for w in range(0,np.shape(matposg)[1]): #índice que cambia de columna. 
    for k in range(0,np.shape(matposg)[0]): #índice que cambia de fila.
        rkj = np.array([])
        xkj = np.array([])
        for j in range(0,np.shape(matposg)[0]): #también este cambia de fila. 
            if j != k:
                r = np.sqrt(sum((matposg[j,:]-matposg[k,:])**2))
                rkj = np.append(rkj,r) 
                xkj = np.append(xkj,matposg[j,w])
        dEdxj = sum(4*(6*(1/rkj)**8- 12*(1/rkj)**14)*(matposg[k,w]-xkj))
        gradv = np.append(gradv,dEdxj)
return gradv

The reason I transform the array input in a matrix is that for each particle there are three coordinates x,y,z so the columns of the matrix are x's, y's, z's for every particle. I tried to do this using np.reshape(), but it seemed to give me wrong results for systems for which the program has already gotten the correct ones.

The code seems to work out fine, but as I increase the number of particles the running time increases exponentially. I know global optimization can take long time, but maybe I'm messing a little with the code. I don′t know if the way to reduce the time of running is pretty obvious,I'm kinda new to optimizing the code, so sorry if it's the case. Of course, any advices are welcome. Thank you all very much!

question from:https://stackoverflow.com/questions/65875730/programing-and-optimizing-code-for-a-basin-hopping-method

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Two things I noticed after a quick look where you can definitely save some time. Both require some more thinking before, but afterwards you will be rewarded with an optimised and cleaner code.


1. Try to avoid using append. append is highly inefficient. You start with an empty array, and afterwards need to allocate more memory each time. This leads to inefficient memory handling, as you copy your array each time you append a number. The longer the array gets, the more inefficient append becomes.

Alternative: Use np.zeros((m,n)) to initialise the array, with m and n being the size it will end up with. Then you need a counter that puts the new values in the corresponding places. If the size of the array is not defined before your calculation, you can initialise it as a big number, and afterwards cut it.


2. Try to avoid using for loops. They are generally very slow, especially when iterating through large matrices, as you need to index each entry individually.

Alternative: Matrix operations are generally much faster. For example, instead of r = np.sqrt(sum((matposg[j,:]-matposg[k,:])**2)) inside two for loops, you could first define two matrices A and B that correspond to matposg[j,:] and matposg[k,:] (should be possible without the use of loops!) and then simply use r = np.linalg.norm(A-B)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...