I have an OSMnx graph and about 6 million query coordinates for which I need to find the nearest nodes in the graph. I want to parallelize the task, so I tried joblib and the method described for shortest paths in [https://github.com/gboeing/osmnx-examples/blob/v0.16.0/notebooks/02-routing-speed-time.ipynb][1]. However, in both attempts only 1 of the 15 available CPUs is used.
I don't know much about multiprocessing and just tried to imitate the shortest-path solution. Am I doing something wrong, or is the method not applicable to the osmnx.get_nearest_node(..) function? Are there any other suggestions to speed up the process?
import pandas as pd
import osmnx as ox
ox.config(use_cache=True, log_console=True)
G = ox.graph_from_bbox(38.289, 36.898, -122.704, -121.214,
                       network_type='drive',
                       retain_all=False,
                       simplify=True)
origins = [(37.775, -122.216), (37.458, -121.913), (37.558, -122.258), (37.791, -122.413),
           (37.775, -122.219), (37.773, -121.952), (37.773, -121.926),
           (37.332, -122.003), (37.462, -122.228), (37.701, -122.089),
           (37.696, -122.143), (37.931, -122.323), (37.558, -122.273),
           (37.357, -121.902), (37.462, -122.228), (37.791, -122.416),
           (37.922, -122.368), (37.551, -122.291), (37.701, -122.088),
           (37.802, -122.267), (38.015, -122.015), (37.701, -122.088),
           (37.503, -122.267), (37.791, -122.416), (37.551, -122.301),
           (37.405, -122.061), (37.228, -121.877), (37.326, -121.814),
           (37.292, -122.032), (37.722, -122.15), (37.966, -122.507),
           (37.773, -121.989), (37.294, -121.895), (37.881, -122.127),
           (37.872, -122.14), (37.551, -122.307), (37.404, -121.976),
           (37.775, -122.209), (37.791, -122.413), (37.228, -121.878)]
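import multiprocessing as mp
import numpy as np  # needed for the np.nan fallback below

def nn(G, origin):
    # Return the nearest node ID, or NaN if the lookup fails
    try:
        return ox.get_nearest_node(G, origin)
    except Exception:
        return np.nan

# Each task is a (graph, coordinate) pair
params = ((G, orig) for orig in origins)
pool = mp.Pool(15)
sma = pool.starmap_async(nn, params)
routes = sma.get()
pool.close()
pool.join()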
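A possible variant of the attempt above, sketched under the unverified assumption that shipping the graph along with every task is what hurts: hand the data to each worker once through a Pool initializer and send only chunks of coordinates afterwards. Here `find_nearest`, `init_worker`, `nearest_for_chunk`, `chunked`, and the small `nodes` table are hypothetical stand-ins so that the sketch runs without OSMnx; with the real graph, `find_nearest` would call ox.get_nearest_node instead.

```python
import math
import multiprocessing as mp

_NODES = None  # per-process node table, filled in by init_worker


def init_worker(nodes):
    """Runs once in each worker, so the node table is sent once per process."""
    global _NODES
    _NODES = nodes


def find_nearest(point):
    """Hypothetical brute-force stand-in for ox.get_nearest_node."""
    lat, lon = point
    return min(
        _NODES,
        key=lambda n: math.hypot(_NODES[n][0] - lat, _NODES[n][1] - lon),
    )


def nearest_for_chunk(points):
    """Resolve a whole chunk of points inside a single task."""
    return [find_nearest(p) for p in points]


def chunked(seq, size):
    """Split seq into consecutive lists of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


if __name__ == "__main__":
    # Toy data (assumption): node id -> (lat, lon)
    nodes = {1: (37.775, -122.216), 2: (37.458, -121.913), 3: (37.558, -122.258)}
    points = [(37.776, -122.217), (37.459, -121.914),
              (37.557, -122.259), (37.774, -122.215)]
    with mp.Pool(2, initializer=init_worker, initargs=(nodes,)) as pool:
        results = pool.map(nearest_for_chunk, chunked(points, 2))
    # Flatten the per-chunk results back into one list
    print([node for chunk in results for node in chunk])
```

The chunk size and worker count here are arbitrary; whether this helps with the real graph and 6 million points would still need to be measured.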
An interesting detail: when I timed the starmap_async version, I got the following result:
CPU times: user 3min 58s, sys: 8.91 s, total: 4min 6s
Wall time: 4min 15s
When I use a simple for loop with the same graph and points, it is much faster:
for origin in origins:
    r = ox.get_nearest_node(G, origin)
CPU times: user 10.8 s, sys: 39.9 ms, total: 10.8 s
Wall time: 10.8 s
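One possible explanation for the gap, which I have not confirmed, is that every entry in params carries the graph, so G may be serialized again for each task (or chunk of tasks) sent to a worker. A minimal sketch for estimating that cost, using a hypothetical stand-in dict `fake_graph` and a helper `pickled_size` rather than the real graph:

```python
import pickle
import time

# Hypothetical stand-in for G: node id -> (lat, lon).
# The real graph object would be substituted here.
fake_graph = {i: (37.0 + i * 1e-6, -122.0 - i * 1e-6) for i in range(200_000)}
origins = [(37.775, -122.216), (37.458, -121.913), (37.558, -122.258)]


def pickled_size(obj):
    """Number of bytes needed to serialize obj for another process."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


# Cost when each task carries the graph (one task per coordinate)
start = time.perf_counter()
per_task_bytes = sum(pickled_size((fake_graph, o)) for o in origins)
per_task_seconds = time.perf_counter() - start

# Cost when the graph is sent once and tasks carry only coordinates
once_bytes = pickled_size(fake_graph) + sum(pickled_size(o) for o in origins)

print(f"graph in every task: {per_task_bytes} bytes, {per_task_seconds:.3f} s")
print(f"graph sent once:     {once_bytes} bytes")
```

If the real graph shows the same pattern, the serialization time multiplied by the number of tasks would be a rough lower bound on the overhead of the parallel version.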
[1]: https://github.com/gboeing/osmnx-examples/blob/v0.16.0/notebooks/02-routing-speed-time.ipynb
question from:
https://stackoverflow.com/questions/65894491/trying-to-parallelize-osmnx-get-nearest-node-function-with-several-ways-st