Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
94 views
in Technique by (71.8m points)

Best and quickest way to get the top N elements from a huge list in Python

I am trying to work out the best way to get the N largest elements from a huge list of a few billion numbers. So far, my idea is:

take the first N elements and sort them in descending order (list A)
for each element x from position N+1 to the last:
    min = the Nth (last) element of A
    if x > min:
        insert x into A and re-sort A
        remove the last element of A

In practice, this does not seem to consume much memory, and it is faster than calling list.sort on the entire huge list and then taking the top N elements.
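For reference, here is a minimal single-process sketch of the idea in Python. It assumes a min-heap of size N is an acceptable stand-in for the sorted list A, so each candidate costs O(log N) instead of a full re-sort. The function name top_n is only illustrative; the standard library's heapq.nlargest(n, iterable) does the same job.

```python
import heapq
from itertools import islice


def top_n(iterable, n):
    """Return the n largest items, in descending order, using a size-n min-heap."""
    if n <= 0:
        return []
    it = iter(iterable)
    # Seed the heap with the first n items; heap[0] is always the smallest kept.
    heap = list(islice(it, n))
    heapq.heapify(heap)
    for x in it:
        # Only items larger than the current minimum can enter the top n.
        if x > heap[0]:
            heapq.heapreplace(heap, x)  # pop the minimum, push x
    return sorted(heap, reverse=True)
```

Because it only iterates once and keeps N items, this also works on a generator or a file reader, not just an in-memory list.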

However, this approach runs on a single core and does not use the full capacity of a multi-core CPU. Is there a built-in function, or another approach, that would do the job with multiple processes, or otherwise make full use of the available computing power and run much faster?

question from:https://stackoverflow.com/questions/65839988/best-and-quickest-way-to-get-top-n-elements-from-a-huge-list-in-python

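He who fights with dragons for too long becomes a dragon himself; gaze into the abyss for too long, and the abyss will gaze back...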

1 Reply

0 votes
by (71.8m points)

If you are looking to parallelize the work, you could use a Python library such as Ray.

Using Ray, you could parallelize the search by partitioning the data into k subsets and having each worker find the largest N numbers in its own subset. Afterwards, you have k lists of N candidates each. Every one of the overall top N must appear in one of those lists, so the largest N of the k*N candidates is the final answer.
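The partition-and-merge steps above can be sketched without any third-party dependency. This is not Ray code: it swaps in the standard library's concurrent.futures.ProcessPoolExecutor for the workers, and the names split, top_n_partitioned and top_n_parallel are made up for illustration. The map_fn parameter is an assumption of this sketch that lets the same logic run serially (the default) or across processes.

```python
import heapq
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def split(data, k):
    """Split data into at most k contiguous chunks of near-equal size."""
    size = max(1, -(-len(data) // k))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def top_n_partitioned(data, n, k, map_fn=map):
    """Top n of data: per-chunk top n via map_fn, then top n of the candidates."""
    local_top = partial(heapq.nlargest, n)  # picklable, so it can cross processes
    candidates = map_fn(local_top, split(data, k))
    return heapq.nlargest(n, (x for chunk in candidates for x in chunk))


def top_n_parallel(data, n, workers=4):
    """Same computation, with the per-chunk step run in worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return top_n_partitioned(data, n, workers, map_fn=pool.map)
```

Calling top_n_parallel(data, n) from under an if __name__ == "__main__": guard is the safe pattern on platforms that spawn worker processes. Note that each chunk is copied to its worker, so for a list of billions of items that transfer cost may outweigh the gain; having each worker load its own slice of the data is likely the better design.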

To learn more about Ray, you can check out its documentation.

Documentation: https://docs.ray.io/en/latest/

