I would like to implement itertools.combinations for numpy. Based on this discussion, I have a function that works for 1D input:
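from itertools import combinations
from numpy import array, asarray, dtype, fromiter

def combs(a, r):
    """
    Return successive r-length combinations of elements in the array a.
    Should produce the same output as array(list(combinations(a, r))), but
    faster.
    """
    a = asarray(a)
    # Structured dtype with r fields, so each combination tuple fills one record
    dt = dtype([('', a.dtype)]*r)
    b = fromiter(combinations(a, r), dt)
    # Reinterpret the records as plain elements and shape them into r columns
    return b.view(a.dtype).reshape(-1, r)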
and the output makes sense:
In [1]: list(combinations([1,2,3], 2))
Out[1]: [(1, 2), (1, 3), (2, 3)]

In [2]: array(list(combinations([1,2,3], 2)))
Out[2]:
array([[1, 2],
       [1, 3],
       [2, 3]])

In [3]: combs([1,2,3], 2)
Out[3]:
array([[1, 2],
       [1, 3],
       [2, 3]])
However, it would be best if I could extend it to N-D inputs, where the additional dimensions simply let you do multiple calls at once, quickly. So, conceptually, if combs([1, 2, 3], 2) produces [1, 2], [1, 3], [2, 3], and combs([4, 5, 6], 2) produces [4, 5], [4, 6], [5, 6], then combs((1,2,3) and (4,5,6), 2) should produce [1, 2], [1, 3], [2, 3] and [4, 5], [4, 6], [5, 6], where "and" just represents parallel rows or columns (whichever makes sense), and likewise for additional dimensions.
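To make the intended behaviour concrete, here is a rough sketch of one way it could look. It is an assumption-laden illustration, not a benchmarked solution: the name combs_nd and its axis parameter are hypothetical, and it swaps the structured-dtype trick for combinations of indices, which are then applied along the chosen axis with take.

```python
from itertools import combinations

import numpy as np


def combs_nd(a, r, axis=0):
    """Hypothetical N-D variant: combine along `axis`; other axes run in parallel."""
    a = np.asarray(a)
    n = a.shape[axis]
    # Build the r-length combinations of positions 0..n-1 once, as an (n_combs, r) array.
    idx = np.fromiter(
        (i for c in combinations(range(n), r) for i in c), dtype=np.intp
    ).reshape(-1, r)
    # take() with a 2-D index array replaces `axis` by two axes: (n_combs, r).
    return np.take(a, idx, axis=axis)


# Two parallel rows, combined along the last axis.
out = combs_nd(np.array([[1, 2, 3], [4, 5, 6]]), 2, axis=1)
print(out.shape)  # (2, 3, 2)
```

With this layout, out[0] holds the combinations of the first row and out[1] those of the second. Whether the new axes should sit where the combined axis was, or somewhere else, is a design choice this sketch does not settle.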
I'm not sure:
- How to make the dimensions work in a logical way that's consistent with the way other functions work. Some numpy functions have an axis= parameter with a default of axis 0, so probably axis 0 should be the one I am combining along, and all other axes just represent parallel calculations?
- How to get the above code to work with N-D input (right now I get ValueError: setting an array element with a sequence.)
- Is there a better way to do dt = dtype([('', a.dtype)]*r)?
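On the last point, one alternative that avoids the structured dtype is to flatten the combinations into a single stream of scalars and reshape afterwards. This is only a sketch for 1-D input; the name combs_flat is hypothetical, and I have not measured whether it is faster than the version above.

```python
from itertools import chain, combinations

import numpy as np


def combs_flat(a, r):
    """Hypothetical 1-D variant of combs() without a structured dtype."""
    a = np.asarray(a)
    # chain.from_iterable turns (1, 2), (1, 3), ... into the flat stream 1, 2, 1, 3, ...
    flat = chain.from_iterable(combinations(a, r))
    # fromiter consumes plain scalars here, so the element dtype can be used directly.
    return np.fromiter(flat, dtype=a.dtype).reshape(-1, r)


print(combs_flat([1, 2, 3], 2).tolist())  # [[1, 2], [1, 3], [2, 3]]
```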