Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
3.8k views
in Technique[技术] by (71.8m points)

python - Multiprocessing an array in chunks

I have broken my standard deviation function into small chunks of std_1, std_2, std_3 etc. to optimize my code to make it run faster. Since I have over 2 million arrays on my main numpy array PC_list. I have used the numba, numpy arrays and multi processing to make the code run faster however I do not see any performance difference in the way even doe the code is broken into pieces from the main function. It takes about 57 seconds for the main function to process and the divided function to compute which makes me believe that the multi processing does not work. Although it takes about 12 to 13 seconds to go through each divided function std1, std2, std3, std4, std5 std6. The PC_list array posted below is a short bit of the actual array. I have also tried running the main function solely on numpy but it hasn't worked and has given me a memory issue, link to previouse issue prev issue.

PC_list array:

number = 5
PC_list= np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])

Main function(entire list length used) takes 58 seconds to run:

@njit
def std():
    std1 = np.array([PC_list[i:i+number].std() for i in range(0, (len(PC_list)-number))])
    return(std1)

Time it takes for each chunk to run: enter image description here

multi processing functions(list split into chunks) takes 58 seconds to run

divide = (len(PC_list)-number)/6
@njit
def std_1():
    print("std1")
    std1 = np.array([PC_list[i:i+number].std() for i in range(0, round(divide))])
    return(std1)
@njit
def std_2():
    print("std2")
    std2 = np.array([PC_list[i:i+number].std() for i in range(round(divide), round(divide*2))])
    return(std2)
@njit
def std_3():
    print("std3")
    std3 = np.array([PC_list[i:i+number].std() for i in range(round(divide*2), round(divide*3))])
    return(std3)
@njit
def std_4():
    print("std4")
    std4 = np.array([PC_list[i:i+number].std() for i in range(round(divide*3), round(divide*4))])
    return(std4)
@njit
def std_5():
    print("std5")
    std5 = np.array([PC_list[i:i+number].std() for i in range(round(divide*4), round(divide*5))])
    return(std5)
@njit
def std_6():
    print("std6")
    std6 = np.array([PC_list[i:i+number].std() for i in range(round(divide*5), round(divide*6))])
    return(std6) 
#setting up the multi processing vars     
p1 = multiprocessing.Process(target=std_1)
p2 = multiprocessing.Process(target=std_2)
p3 = multiprocessing.Process(target=std_3)
p4 = multiprocessing.Process(target=std_4)
p5 = multiprocessing.Process(target=std_5)
p6 = multiprocessing.Process(target=std_6)
#running the multi processes 
if __name__ == "__main__":
    p1.start()
    p2.start()
    p3.start()
    p4.start()
    p5.start()
    p6.start()
std1 = std_1()
std2 = std_2()
std3 = std_3()
std4 = std_4()
std5 = std_5()
std6 = std_6()
#Combinning all the standard deviation chunks together
std = np.concatenate((std1, std2, std3, std4, std5, std6))

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
等待大神解答

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...