Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


multiprocessing - Python ProcessPoolExecutor with conditions

I have to process a large amount of image data and would like to use the .map() function from the concurrent.futures package to speed it up. The goal is to loop over all the images in a directory, process them, and then save them in another directory. This in itself is not a problem but I would like to save 90% of the processed images in one directory and the remaining 10% in another directory. How can I do this using .map()?

Without .map() I enumerate the images and then say:

for enumerator, image in enumerate(directory):
    if enumerator < len(directory) * 0.9:
        ...  # save image in one directory
    else:
        ...  # save image in another directory

How can I add this to the function I call with .map(), since I don't have access to the enumerator anymore?

Any help is very much appreciated!

All the best, snowe



1 Reply


You can pass additional iterables to the map function. On each call, one element from each iterable is passed to your function as a positional argument:

def my_function(file, sorting_bool):
  if sorting_bool:
    ...  # do this with `file`
  else:
    ...  # do that with `file`

total = len(directory)
# True for the first 90% of the indices, False for the rest
dir_sorted = [i < 0.9 * total for i in range(total)]
pool.map(my_function, directory, dir_sorted)
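
The flag approach above can be put together as a small end-to-end sketch. It is only an illustration under stated assumptions: process_image, split_flags, the directory names out_main and out_rest, and the file names are all hypothetical, and the actual image processing is replaced by returning the path the image would be saved to.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import PurePosixPath

# Hypothetical output directories; substitute your own.
MAIN_DIR = PurePosixPath("out_main")
REST_DIR = PurePosixPath("out_rest")


def process_image(filename, in_first_part):
    """Placeholder worker: return where the processed image would be saved."""
    target_dir = MAIN_DIR if in_first_part else REST_DIR
    # Real code would load, process and save the image here.
    return str(target_dir / filename)


def split_flags(total, fraction=0.9):
    """Return one boolean per index: True while index < fraction * total."""
    return [i < fraction * total for i in range(total)]


def run(filenames):
    """Process all files in a process pool, pairing each file with its flag."""
    flags = split_flags(len(filenames))
    with ProcessPoolExecutor() as pool:
        # map() pairs filenames[i] with flags[i] and yields results in input order.
        return list(pool.map(process_image, filenames, flags))


if __name__ == "__main__":
    names = [f"img_{i:02d}.png" for i in range(10)]
    for path in run(names):
        print(path)
```

The worker is defined at module level because a process pool has to pickle it, and the pool is started under the __main__ guard so the script also works on platforms that spawn new interpreters.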

More generally, for other tasks you can pass each job its own id and the total number of jobs:

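from itertools import repeat

def my_function(file, job_id, total_jobs):
  if job_id < total_jobs * 0.9:
    ...  # do this
  else:
    ...  # do that

total = len(directory)
# every argument after the function must be an iterable, so the constant is repeated
pool.map(my_function, directory, range(total), repeat(total))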
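
Every extra argument to map must be an iterable, so a constant such as the total has to be wrapped, for example with itertools.repeat. A minimal sketch of the job id idea, assuming a hypothetical choose_dir worker, a hypothetical build_args helper, and made-up directory names:

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import repeat


def choose_dir(filename, job_id, total_jobs):
    """Hypothetical worker: pick an output path from the job's position."""
    if job_id < total_jobs * 0.9:
        return f"first_90/{filename}"
    return f"last_10/{filename}"


def build_args(filenames):
    """Return the three iterables handed to map(), one per worker parameter."""
    total = len(filenames)
    return filenames, range(total), repeat(total)


if __name__ == "__main__":
    names = [f"img_{i}.png" for i in range(20)]
    with ProcessPoolExecutor() as pool:
        # repeat(total) is infinite; map() stops at the shortest iterable.
        for target in pool.map(choose_dir, *build_args(names)):
            print(target)
```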

You can then use those numbers however you like inside my_function.

If the total number of jobs is not known in advance, you can still generate job ids with a counter generator (itertools.count() from the standard library does the same thing):

def counter():
  i = 0
  while True:
    yield i
    i += 1

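# counter() never ends, so at least one of the other iterables must be finite;
# map() stops as soon as the shortest iterable is exhausted
pool.map(my_function, counter(), other, args)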
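
When the total is unknown, a 90% threshold cannot be computed, but a counter can still drive an approximate split, for instance by sending every tenth job to the second directory. This is a different rule from the threshold used earlier and is offered only as a sketch: itertools.count stands in for the hand-written generator, and route, filenames, and the directory names are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import count


def route(job_id, filename):
    """Hypothetical worker: every tenth job goes to the second directory."""
    if job_id % 10 == 9:
        return f"rest/{filename}"
    return f"main/{filename}"


def filenames():
    """Stand-in for a source whose length is not known up front."""
    for i in range(25):
        yield f"img_{i}.png"


if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        # count() is infinite, so the finite filenames() generator ends the map.
        results = list(pool.map(route, count(), filenames()))
    print(sum(r.startswith("rest/") for r in results), "of", len(results))
```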
