Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

python - pretty printing numpy ndarrays using unicode characters

I have recently noticed that Python's printing is not consistent for NumPy ndarrays. For example, it prints a horizontal 1D array horizontally:

import numpy as np
A1=np.array([1,2,3])
print(A1)
#--> [1 2 3]

but a 1D horizontal array with redundant brackets (shape (3, 1)) vertically:

A2=np.array([[1],[2],[3]])
print(A2)
#--> [[1]
#     [2]
#     [3]]

a 1D vertical array (shape (1, 3)) horizontally:

A3=np.array([[1,2,3]])
print(A3)
#--> [[1 2 3]]

and a 2D array:

B=np.array([[11,12,13],[21,22,23],[31,32,33]])
print(B)
# --> [[11 12 13]
#      [21 22 23]
#      [31 32 33]]

where the first dimension is now vertical. It gets even worse for higher dimensions as all of them are printed vertically:

C=np.array([[[111,112],[121,122]],[[211,212],[221,222]]])
print(C)
#--> [[[111 112]
#      [121 122]]
#
#     [[211 212]
#      [221 222]]]

In my opinion, a consistent behavior would be to print the even dimensions horizontally and the odd ones vertically. With Unicode characters it would be possible to format this nicely. I was wondering whether it is possible to create a function that prints the above arrays as:

A1 --> [1 2 3]
A2 --> ┌┌─┐┌─┐┌─┐┐
       │ 1  2  3 │
       └└─┘└─┘└─┘┘
A3 --> ┌┌─┐┐ # \u250c\u2500\u2510
       │ 1 │ # \u2502
       │ 2 │
       │ 3 │
       └└─┘┘ # \u2514\u2500\u2518
B -->  ┌┌──┐┌──┐┌──┐┐ 
       │ 11  21  31 │
       │ 12  22  32 │
       │ 13  23  33 │
       └└──┘└──┘└──┘┘ 

C -->  ┌┌─────────┐┌─────────┐┐
       │ [111 112]  [211 212] │
       │ [121 122]  [221 222] │
       └└─────────┘└─────────┘┘ 

I found this gist, which handles entries with different numbers of digits. I tried to prototype a recursive function to implement the above concept:

def npprint(A):
    assert isinstance(A, np.ndarray), "input of npprint must be array like"
    if A.ndim==1 :
        print(A)
    else:
        for i in range(A.shape[1]):
            npprint(A[:,i])

It kind of works for A1, A2, A3 and B, but not for C. I would appreciate help with how npprint should be written to achieve the above output for NumPy ndarrays of arbitrary dimension.

P.S.1. In a Jupyter environment one can use the LaTeX mathtools commands \underbracket and \overbracket in Markdown. Sympy's pretty printing functionality is also a great starting point. It can use ASCII, Unicode, LaTeX...

P.S.2. I'm being told that there is indeed a consistency in the way ndarrays are printed. However, IMHO it is kind of weird and non-intuitive. Having a flexible pretty printing function could help a lot to display ndarrays in different forms.

P.S.3. The Sympy developers have already considered both points I have mentioned here. Their Matrix module is pretty consistent (A1 and A2 are the same), and they also have a pprint function which does much the same thing that I expect from npprint here.

P.S.4. For those who want to follow up on this idea, I have integrated everything here in this Jupyter Notebook


1 Reply


It was quite a revelation to me to understand that NumPy arrays are nothing like the MATLAB matrices or multidimensional mathematical arrays I had in mind. They are better thought of as homogeneous, uniformly nested Python lists. I also understood that the last axis of a NumPy array corresponds to the innermost pair of square brackets and is printed horizontally; from there, the second-to-last axis is printed vertically, the third-to-last vertically with a blank line between blocks, and so on.
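As a small illustration of this printing convention in NumPy's own axis numbering, the following sketch prints a 2x2x2 array and the shapes of the two "1D" arrays from the question. The variable names are only for illustration.

```python
import numpy as np

# A 2x2x2 array: axes 0, 1 and 2 each have length 2.
C = np.arange(8).reshape(2, 2, 2)
text = str(C)
print(text)

# The last axis (axis 2) runs left to right inside the innermost brackets,
# axis 1 runs down the lines of each block, and axis 0 separates the blocks
# with a blank line.
lines = text.splitlines()

# The two arrays from the question are both 2D; only their shapes differ.
shape_a2 = np.array([[1], [2], [3]]).shape   # (3, 1): printed as a column
shape_a3 = np.array([[1, 2, 3]]).shape       # (1, 3): printed as a row
print(shape_a2, shape_a3)
```

Seen this way, A2 and A3 are printed by the same rule: the last axis is always the horizontal one.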

Anyway, I think having a pprint function (inspired by Sympy's naming convention) could help a lot. So I'm going to put a very bad implementation here, hoping it will inspire more advanced Pythoners to come up with better solutions:

def pprint(A):
    # 1D arrays are printed as usual; the framed layout below assumes a 2D array.
    if A.ndim==1:
        print(A)
    else:
        # Width of each column: the longest string form among its entries.
        widths = [max(len(str(s)) for s in A[:,i]) for i in range(A.shape[1])]
        # Row width: all columns, one space between columns, and two brackets.
        w = sum(widths)+len(widths)+1
        print('\u250c'+'\u2500'*w+'\u2510')
        for AA in A:
            # Left-justify every entry to the width of its column.
            cells = [str(AAA).ljust(w1) for AAA,w1 in zip(AA,widths)]
            print(' ['+' '.join(cells)+']')
        print('\u2514'+'\u2500'*w+'\u2518')

and the result is somewhat acceptable for 1D and 2D arrays:

B1=np.array([[111,122,133],[21,22,23],[31,32,33]])
pprint(B1)

#┌─────────────┐
# [111 122 133]
# [21  22  23 ]
# [31  32  33 ]
#└─────────────┘

This is admittedly very bad code, and it only works for integers. Hopefully others will come up with better solutions.
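One possible way around the integers-only limitation is to format every entry with an explicit template before measuring the column widths. The sketch below is only an assumption about how that could look: frame_2d and its fmt parameter are hypothetical names, not part of NumPy or of the code above. It takes any 2D table of values, and a 2D ndarray should work the same way since iterating over it yields its rows.

```python
def frame_2d(rows, fmt="{}"):
    """Return a framed text rendering of a 2D table of values.

    `rows` may be a 2D ndarray or a list of equal-length lists; `fmt` is a
    str.format template applied to every entry (for example "{:.2f}").
    """
    cells = [[fmt.format(x) for x in row] for row in rows]
    # One width per column, taken from the formatted strings.
    widths = [max(len(c) for c in col) for col in zip(*cells)]
    # Columns, one space between columns, and the two square brackets.
    inner = sum(widths) + len(widths) + 1
    lines = ["\u250c" + "\u2500" * inner + "\u2510"]
    for row in cells:
        # Right-justify so that fixed-precision numbers line up.
        padded = " ".join(c.rjust(w) for c, w in zip(row, widths))
        lines.append(" [" + padded + "]")
    lines.append("\u2514" + "\u2500" * inner + "\u2518")
    return "\n".join(lines)

print(frame_2d([[1.5, -22.25], [100.0, 3.0]], fmt="{:.2f}"))
```

Returning a string instead of printing directly keeps the function easy to test and to reuse inside larger layouts.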

P.S.1. Eric Wieser has already implemented a very nice HTML prototype for IPython/Jupyter, which can be seen here:

[image]

You may follow the discussion on numpy mailing list here.

P.S.2. I also posted this idea here on Reddit.

P.S.3. I spent some time extending the code to 3D arrays:

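def ndtotext(A, w=None, h=None):
    # Render a 1D, 2D or 3D array as text. w holds per-column widths for 1D rows;
    # h is kept from the original signature and is overwritten in the 3D branch.
    if A.ndim==1:
        if w is None:
            return str(A)
        else:
            # Pad every entry to its column width, plus one separating space.
            s = '['
            for i,AA in enumerate(A[:-1]):
                s += str(AA)+' '*(max(w[i],len(str(AA)))-len(str(AA))+1)
            s += str(A[-1])+' '*(max(w[-1],len(str(A[-1])))-len(str(A[-1]))) +'] '
    elif A.ndim==2:
        # Column widths, then the total width of one bracketed row.
        w1 = [max([len(str(s)) for s in A[:,i]])  for i in range(A.shape[1])]
        w0 = sum(w1)+len(w1)+1
        s = '\u250c'+'\u2500'*w0+'\u2510' +'\n'
        for AA in A:
            s += ' ' + ndtotext(AA, w=w1) +'\n'
        s += '\u2514'+'\u2500'*w0+'\u2518'
    elif A.ndim==3:
        # Left and right outer brackets, as tall as one framed 2D block.
        h = A.shape[1]
        s1 = '\u250c' +'\n' + ('\u2502'+'\n')*h + '\u2514'+'\n'
        s2 = '\u2510' +'\n' + ('\u2502'+'\n')*h + '\u2518'+'\n'
        strings = [ndtotext(a)+'\n' for a in A]
        strings.append(s2)
        strings.insert(0,s1)
        # Glue the blocks together line by line so they sit side by side.
        s = '\n'.join(''.join(pair) for pair in zip(*map(str.splitlines, strings)))
    else:
        raise NotImplementedError('ndtotext only handles 1D, 2D and 3D arrays')
    return s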

and as an example:

shape = 4, 3, 6
B2=np.arange(np.prod(shape)).reshape(shape)
print(B2)
print(ndtotext(B2))        


[[[ 0  1  2  3  4  5]
  [ 6  7  8  9 10 11]
  [12 13 14 15 16 17]]

 [[18 19 20 21 22 23]
  [24 25 26 27 28 29]
  [30 31 32 33 34 35]]

 [[36 37 38 39 40 41]
  [42 43 44 45 46 47]
  [48 49 50 51 52 53]]

 [[54 55 56 57 58 59]
  [60 61 62 63 64 65]
  [66 67 68 69 70 71]]]
┌┌───────────────────┐┌───────────────────┐┌───────────────────┐┌───────────────────┐┐
│ [0  1  2  3  4  5 ]  [18 19 20 21 22 23]  [36 37 38 39 40 41]  [54 55 56 57 58 59] │
│ [6  7  8  9  10 11]  [24 25 26 27 28 29]  [42 43 44 45 46 47]  [60 61 62 63 64 65] │
│ [12 13 14 15 16 17]  [30 31 32 33 34 35]  [48 49 50 51 52 53]  [66 67 68 69 70 71] │
└└───────────────────┘└───────────────────┘└───────────────────┘└───────────────────┘┘
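The same idea might be extended to more than three dimensions by recursing over the first axis and alternating between stacking the sub-blocks vertically and laying them side by side. The following is only a rough sketch under stated assumptions: frame and nd_lines are hypothetical names, every entry is padded to one common width, the array is assumed non-empty with at least one dimension, and the box layout is a simplification rather than a reproduction of the output above. Like ndtotext, it keeps the last axis horizontal.

```python
import numpy as np

def frame(lines):
    """Wrap equal-width text lines in a Unicode box (hypothetical helper)."""
    w = len(lines[0])
    top = "\u250c" + "\u2500" * w + "\u2510"
    bottom = "\u2514" + "\u2500" * w + "\u2518"
    return [top] + ["\u2502" + ln + "\u2502" for ln in lines] + [bottom]

def nd_lines(A, width=None):
    """Return the lines of a framed rendering of a non-empty array, ndim >= 1."""
    A = np.asarray(A)
    if A.ndim == 0 or A.size == 0:
        raise ValueError("this sketch expects a non-empty array with ndim >= 1")
    if width is None:
        # One common cell width, so all sub-blocks end up the same size.
        width = max(len(str(x)) for x in A.flat)
    if A.ndim == 1:
        # The last axis is always written left to right on a single line.
        return [" ".join(str(x).rjust(width) for x in A)]
    children = [nd_lines(a, width) for a in A]
    if A.ndim % 2 == 0:
        # Even number of dimensions: stack the sub-blocks vertically.
        body = [ln for child in children for ln in child]
    else:
        # Odd number of dimensions: place the sub-blocks side by side.
        body = [" ".join(parts) for parts in zip(*children)]
    return frame(body)

print("\n".join(nd_lines(np.arange(8).reshape(2, 2, 2))))
```

Working with lists of lines rather than one long string makes the side-by-side step a plain zip over the sub-blocks.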
