Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Numpy: efficient way to generate combinations from given ranges

I have an array with one [low, high] row per dimension, as shown below:

np.array([[0,3],[0,3],[0,10]])

In this array, each row gives the low and high values of a range, both inclusive. For example, [0,3] refers to [0,1,2,3].

I need to generate all combinations of values drawn from the ranges given above. For example, I want [0,0,0], [0,0,1] ... [0,1,0] ... [3,3,10]

I have tried the following to get what I want:

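import numpy as np
from functools import reduce  # reduce is not a builtin in Python 3

ds = np.array([[0,3],[0,3],[0,10]])
# Total number of combinations: product of (high - low + 1) over all ranges
nItems = int(reduce(lambda a,b: a * (b[1] - b[0] + 1), ds, 1))
myCombinations = np.zeros((nItems,))
nArrays = []
for x in range(ds.shape[0]):
    low = ds[x][0]
    high= ds[x][1]
    nitm = high - low + 1
    # Expand the inclusive range [low, high] into an explicit list
    ar = [i+low for i in range(nitm) ]
    nArrays.append(ar)

# cartesian() is not part of NumPy; its source is mentioned below
myCombinations = cartesian(nArrays)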

The cartesian function was taken from the question "Using numpy to build an array of all combinations of two arrays".

I need to do this a few million times.

My question: is there a better or more efficient way to do this?



1 Reply


I think what you're looking for is np.mgrid. Unfortunately, this returns the array in a format that's different from what you need, so you'll need to do a little post-processing:

a = np.mgrid[0:4, 0:4, 0:11]     # All points in a 3D grid within the given ranges
a = np.rollaxis(a, 0, 4)         # Make the 0th axis into the last axis
a = a.reshape((4 * 4 * 11, 3))   # Now you can safely reshape while preserving order
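
If the ranges live in an array like ds from the question, the three lines above can be generalized. This is only a sketch: combinations_from_ranges is a hypothetical helper name, and it assumes integer bounds with inclusive upper limits, as in the question.

```python
import numpy as np

def combinations_from_ranges(ds):
    """Hypothetical helper: every integer point inside the inclusive ranges in ds."""
    ds = np.asarray(ds)
    # One slice per row; +1 because the question treats the high value as inclusive
    slices = tuple(slice(int(low), int(high) + 1) for low, high in ds)
    grid = np.mgrid[slices]                  # shape: (ndim, n0, n1, ...)
    grid = np.rollaxis(grid, 0, grid.ndim)   # move the coordinate axis to the end
    return grid.reshape(-1, len(ds))         # one row per grid point

ds = np.array([[0, 3], [0, 3], [0, 10]])
combos = combinations_from_ranges(ds)
print(combos.shape)   # (176, 3)
```

Whether this is fast enough for a few million calls is something you would have to time on your own data.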

Explanation

np.mgrid gives you a set of grid points in N-dimensional space. Let me try to show this with a smaller example, to make things clearer:

>>> a = np.mgrid[0:2, 0:2]
>>> a
array([[[0, 0],
        [1, 1]],

       [[0, 1],
        [0, 1]]])

Since I've given two sets of ranges, 0:2, 0:2, I get a 2D grid. What mgrid returns is the x-values and the y-values corresponding to the grid points (0, 0), (0, 1), (1, 0) and (1, 1) in 2D space. a[0] tells you what the x-values of the four points are, and a[1] tells you what the y-values are.
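
To see the pairing concretely, here is a small sketch that zips the two sub-arrays back into points. The names xs, ys and points are just for illustration.

```python
import numpy as np

a = np.mgrid[0:2, 0:2]
xs = a[0].ravel()   # x-values of the four grid points, read row by row
ys = a[1].ravel()   # y-values of the same four points
points = [(int(x), int(y)) for x, y in zip(xs, ys)]
print(points)       # [(0, 0), (0, 1), (1, 0), (1, 1)]
```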

But what you really want is that list of actual grid points that I've written out, not the x- and y-values of those points separately. The first instinct is to just reshape the array as desired:

>>> a.reshape((4, 2))
array([[0, 0],
       [1, 1],
       [0, 1],
       [0, 1]])

But clearly this doesn't work, because it effectively reshapes the flattened array (the array obtained by just reading all elements in order), and that's not what you want.
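
A quick way to convince yourself of this is to flatten a by hand and compare. A minimal sketch, where flat, naive and same are illustrative names:

```python
import numpy as np

a = np.mgrid[0:2, 0:2]
flat = a.ravel()                # elements in the order reshape will consume them
print(flat.tolist())            # [0, 0, 1, 1, 0, 1, 0, 1]

naive = a.reshape((4, 2))       # just chops the flattened array into pairs
same = np.array_equal(naive, flat.reshape((4, 2)))
print(same)                     # True
```

All the x-values come first in the flattened order, followed by all the y-values, which is why the pairs come out wrong.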

What you want to do is to look down the third dimension of a, and create an array:

[ [a[0][0, 0], a[1][0, 0]],
  [a[0][0, 1], a[1][0, 1]],
  [a[0][1, 0], a[1][1, 0]],
  [a[0][1, 1], a[1][1, 1]] ]

which reads "First tell me the first point (x1, y1), then the second point (x2, y2), ..." and so on. Perhaps this is better explained with a figure, of sorts. This is what a looks like:

                you want to read
                in this direction
                 (0, 0)   (0, 1)
                   |        |
                   |        |
                   v        v

          /        0--------0            +----> axis0
 x-values |       /|       /|           /|
          |      / |      / |    axis1 / |
          \     1--------1  |         L  |
                |  |     |  |            v
          /     |  0-----|--1           axis2
 y-values |     | /      | /
          |     |/       |/
          \     0--------1

                |        |
                |        |
                v        v
              (1, 0)   (1, 1)
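
Written out as a plain loop, that reading order looks like the sketch below. It is far too slow for real use and is only meant to pin down the target ordering; points is an illustrative name.

```python
import numpy as np

a = np.mgrid[0:2, 0:2]

# For each grid position (i, j), read the x-value from a[0] and the y-value from a[1]
points = [[int(a[0][i, j]), int(a[1][i, j])]
          for i in range(2)
          for j in range(2)]
print(points)   # [[0, 0], [0, 1], [1, 0], [1, 1]]
```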

np.rollaxis gives you a way to do this. np.rollaxis(a, 0, 3) in the above example says "take the 0th (or outermost) axis and make it into the last (or innermost) axis". (Note: only axes 0, 1 and 2 actually exist here, so saying "send the 0th axis to the 3rd position" is a way of telling NumPy to put the 0th axis after the last axis.) You might also want to read this.

>>> a = np.rollaxis(a, 0, 3)
>>> a
array([[[0, 0],
        [0, 1]],

       [[1, 0],
        [1, 1]]])

This is starting to look like what you want, except there's an extra array dimension. We want to merge dimensions 0 and 1 to get a single array of grid points. Now that the flattened array reads in the order you expect, you can safely reshape it to get the desired result.

>>> a = a.reshape((4, 2))
>>> a
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])
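
For what it's worth, the same axis move can be spelled in other ways. The sketch below assumes NumPy 1.11 or later, where np.moveaxis is available; a_roll, a_move and a_tr are illustrative names.

```python
import numpy as np

a = np.mgrid[0:2, 0:2]

a_roll = np.rollaxis(a, 0, 3)    # the call used above
a_move = np.moveaxis(a, 0, -1)   # "move axis 0 to the last position"
a_tr = a.transpose(1, 2, 0)      # explicit axis order: old axes 1, 2, then 0

# All three give the same (2, 2, 2) array, so the final reshape is identical
points = a_move.reshape((-1, 2))
print(points.tolist())           # [[0, 0], [0, 1], [1, 0], [1, 1]]
```

Using -1 in the reshape lets NumPy work out the number of grid points instead of hard-coding 4.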

The 3D version does exactly the same thing, except that I couldn't make a figure for it, since it'd be in 4D.
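
In place of a figure, the 3D result can be checked against itertools.product from the standard library. This is a sanity-check sketch rather than a faster method; expected and matches are illustrative names.

```python
import itertools
import numpy as np

a = np.mgrid[0:4, 0:4, 0:11]     # All points in a 3D grid within the given ranges
a = np.rollaxis(a, 0, 4)         # Make the 0th axis into the last axis
a = a.reshape((4 * 4 * 11, 3))   # One row per grid point

# Reference ordering: the last range varies fastest
expected = np.array(list(itertools.product(range(4), range(4), range(11))))
matches = np.array_equal(a, expected)
print(matches)                   # True
```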

