NumPy arrays are stored as contiguous blocks of memory. They usually have a single datatype (e.g. integers, floats or fixed-length strings), and the bits in memory are interpreted as values of that datatype.
Creating an array with dtype=object is different. The memory taken by the array is now filled with pointers to Python objects which are stored elsewhere in memory (much like a Python list is really just a list of pointers to objects, not the objects themselves).
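To make the difference concrete, here is a small sketch comparing the two storage models. The names words, fixed and boxed are made up for illustration and do not come from the question.

```python
import numpy as np

words = ['avinash', 'jay']            # hypothetical sample data

fixed = np.array(words)               # fixed-length unicode dtype: characters live in the array buffer
boxed = np.array(words, dtype=object) # object dtype: the buffer holds pointers to Python str objects

print(fixed.dtype.kind, fixed.itemsize)  # 'U' and 28: 7 characters at 4 bytes each per element
print(boxed.dtype, boxed.itemsize)       # object dtype; itemsize is the size of one pointer

# The object array refers to the very same Python strings as the list does
print(boxed[0] is words[0])              # True
```

The identity check at the end is the point: the object array does not copy the string data, it only holds references to it.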
Arithmetic operators such as * don't work with arrays such as ar1, which have a fixed-length string datatype such as string_ or unicode_ (there are special functions instead - see below). NumPy is just treating the bits in memory as characters, and the * operator doesn't make sense here. However, the line
np.array(['avinash','jay'], dtype=object) * 2
works because now the array is an array of (pointers to) Python strings. The * operator is well defined for these Python string objects. New Python strings are created in memory and a new object array with references to the new strings is returned.
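The contrast can be sketched as follows. It assumes ar1 was built from the same two strings without an explicit dtype, which the question's output suggests but which is not shown here. On the NumPy versions I'm aware of the first multiplication raises a TypeError, so the sketch catches it rather than relying on the exact message.

```python
import numpy as np

ar1 = np.array(['avinash', 'jay'])   # assumed definition: fixed-length unicode dtype

try:
    ar1 * 2
    fixed_width_failed = False
except TypeError:
    # no plain * for fixed-length string arrays
    fixed_width_failed = True
print(fixed_width_failed)

obj = np.array(['avinash', 'jay'], dtype=object)
doubled = obj * 2                    # falls back to str.__mul__ on each element
print(doubled.dtype)                 # object
print(doubled.tolist())              # ['avinashavinash', 'jayjay']
```

Note that the result is again an object array, so the new strings are ordinary Python str objects rather than fixed-length NumPy strings.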
If you have an array with string_ or unicode_ dtype and want to repeat each string, you can use np.char.multiply:
In [52]: np.char.multiply(ar1, 2)
Out[52]: array(['avinashavinash', 'jayjay'],
               dtype='<U14')
NumPy has many other vectorised string methods too.
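A few of those np.char functions, as a quick sketch (again assuming ar1 holds the two strings from the question):

```python
import numpy as np

ar1 = np.array(['avinash', 'jay'])        # assumed definition of ar1

upper = np.char.upper(ar1)                # element-wise str.upper
lengths = np.char.str_len(ar1)            # element-wise len
starts_j = np.char.startswith(ar1, 'j')   # element-wise str.startswith

print(upper.tolist())     # ['AVINASH', 'JAY']
print(lengths.tolist())   # [7, 3]
print(starts_j.tolist())  # [False, True]
```

These return regular NumPy arrays (string, integer and boolean respectively), so the results can be used directly for indexing or further vectorised work.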