I am trying to use the itertools.product function to make a segment of my code (in an isotopic pattern simulator) easier to read and, hopefully, faster as well (the documentation states that no intermediate results are created). However, I have tested both versions of the code against each other using the cProfile module and noticed that the itertools.product version was significantly slower than my nested for loops.
Example values used for the testing:
carbons = [(0.0, 0.004613223957020534), (1.00335, 0.02494768843632857), (2.0067, 0.0673219412049374), (3.0100499999999997, 0.12087054681917497), (4.0134, 0.16243239687902825), (5.01675, 0.17427700732161705), (6.020099999999999, 0.15550695260604208), (7.0234499999999995, 0.11869556397525197), (8.0268, 0.07911287899598853), (9.030149999999999, 0.04677626606764402)]
hydrogens = [(0.0, 0.9417611429667746), (1.00628, 0.05651245007201512)]
nitrogens = [(0.0, 0.16148864310897554), (0.99703, 0.2949830688288726), (1.99406, 0.26887643366755537), (2.99109, 0.16305943261399866), (3.98812, 0.0740163089529218), (4.98515, 0.026824040474519875), (5.98218, 0.008084687617425748)]
oxygens17 = [(0.0, 0.8269292736927519), (1.00422, 0.15717628899143962), (2.00844, 0.014907548827832968)]
oxygens18 = [(0.0, 0.3584191873916266), (2.00425, 0.36813434247849824), (4.0085, 0.18867830334103902), (6.01275, 0.06433912182670033), (8.017, 0.016421642936302827)]
sulfurs33 = [(0.0, 0.02204843659673093), (0.99939, 0.08442569434459646), (1.99878, 0.16131398792444965), (2.99817, 0.2050722764666321), (3.99756, 0.1951327596407101), (4.99695, 0.14824112268069747), (5.99634, 0.09365899226198841), (6.99573, 0.050618028523695714), (7.99512, 0.023888506307006133), (8.99451, 0.010000884811585533)]
sulfurs34 = [(0.0, 3.0106350597190195e-10), (1.9958, 6.747270089956428e-09), (3.9916, 7.54568412614702e-08), (5.9874, 5.614443102700176e-07), (7.9832, 3.1268212758750728e-06), (9.979, 1.3903197959791067e-05), (11.9748, 5.141248916434075e-05), (13.970600000000001, 0.0001626288218672788), (15.9664, 0.00044921518047309414), (17.9622, 0.0011007203440032396)]
sulfurs36 = [(0.0, 0.904828368500412), (3.99501, 0.0905009370374487)]
Snippet demonstrating nested for loops:
totals = []
for i in carbons:
    for j in hydrogens:
        for k in nitrogens:
            for l in oxygens17:
                for m in oxygens18:
                    for n in sulfurs33:
                        for o in sulfurs34:
                            for p in sulfurs36:
                                totals.append((i[0]+j[0]+k[0]+l[0]+m[0]+n[0]+o[0]+p[0], i[1]*j[1]*k[1]*l[1]*m[1]*n[1]*o[1]*p[1]))
Snippet demonstrating the use of itertools.product:
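import itertools

totals = []
for i in itertools.product(carbons, hydrogens, nitrogens, oxygens17, oxygens18, sulfurs33, sulfurs34, sulfurs36):
    # Start from the first element's (mass difference, chance) pair
    massDiff = i[0][0]
    chance = i[0][1]
    # Accumulate the remaining elements of this combination
    for j in i[1:]:
        massDiff += j[0]
        chance = chance * j[1]
    totals.append((massDiff, chance))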
The profiling results (based on 10 runs per method) averaged ~0.8 seconds for the nested for loop approach and ~1.3 seconds for the itertools.product approach. My question is thus: am I using the itertools.product function incorrectly, or should I just stick to the nested for loops?
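One difference between the two snippets is that the itertools.product version runs an extra Python-level loop and a slice (i[1:]) for every combination, while the nested loops do their arithmetic in a single expression. A minimal sketch of a variant that unpacks each combination directly in the for statement is shown here, using three of the lists above to keep it short. The function names are hypothetical, and whether this closes the gap is an assumption that would need to be profiled rather than a measured result.

```python
import itertools

# Three of the example lists from the question (2 * 3 * 2 = 12 combinations).
hydrogens = [(0.0, 0.9417611429667746), (1.00628, 0.05651245007201512)]
oxygens17 = [(0.0, 0.8269292736927519), (1.00422, 0.15717628899143962),
             (2.00844, 0.014907548827832968)]
sulfurs36 = [(0.0, 0.904828368500412), (3.99501, 0.0905009370374487)]


def combine_with_inner_loop(*distributions):
    """Same structure as the itertools.product snippet in the question."""
    totals = []
    for combo in itertools.product(*distributions):
        mass_diff = combo[0][0]
        chance = combo[0][1]
        for peak in combo[1:]:  # extra Python loop and slice per combination
            mass_diff += peak[0]
            chance *= peak[1]
        totals.append((mass_diff, chance))
    return totals


def combine_with_unpacking(a, b, c):
    """Hypothetical variant: unpack each combination, no inner loop or slice."""
    totals = []
    for (m1, p1), (m2, p2), (m3, p3) in itertools.product(a, b, c):
        totals.append((m1 + m2 + m3, p1 * p2 * p3))
    return totals


looped = combine_with_inner_loop(hydrogens, oxygens17, sulfurs36)
unpacked = combine_with_unpacking(hydrogens, oxygens17, sulfurs36)
print(looped == unpacked)
```

The same unpacking pattern extends to all eight lists by adding more (mass, chance) pairs to the for statement.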
-- UPDATE --
I have included two of my cProfile results:
# ITERTOOLS.PRODUCT APPROACH
         420003 function calls in 1.306 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.018    0.018    1.306    1.306 <string>:1(<module>)
        1    1.246    1.246    1.289    1.289 IsotopeBas.py:64(option1)
   420000    0.042    0.000    0.042    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
and:
# NESTED FOR LOOP APPROACH
         420003 function calls in 0.830 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.019    0.019    0.830    0.830 <string>:1(<module>)
        1    0.769    0.769    0.811    0.811 IsotopeBas.py:78(option2)
   420000    0.042    0.000    0.042    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
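For anyone wanting to reproduce reports of this shape, the following is a minimal sketch of collecting a cProfile report for a single function call. The helper name profile_call is hypothetical, and this is not necessarily how the numbers above were produced.

```python
import cProfile
import io
import pstats


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    # "stdname" gives the "Ordered by: standard name" ordering.
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("stdname").print_stats()
    return result, stream.getvalue()


# Usage with a trivial stand-in workload; option1 or option2 would go here.
result, report = profile_call(sum, [1, 2, 3])
print(report)
```

Both tables above show the same 0.042 seconds in list.append, so the difference between the two approaches sits entirely in the tottime of option1 versus option2.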