Quick Fix (read on for more details and a more sophisticated approach):
You need to initialize the variable PyArray_API in every cpp-file in which you use NumPy functionality by calling import_array():
// A trick to ensure import_array() is called when the *.so is loaded:
// the static initializer below runs init_numpy() exactly once.
int init_numpy(){
    import_array(); // PyError if not successful
    return 0;
}
const static int numpy_initialized = init_numpy();

void parse_ndarraray(PyObject *obj) { // would be called every time
    if (PyArray_Check(obj)) {
        cout << "PyArray_Check Passed" << endl;
    } else {
        cout << "PyArray_Check Failed" << endl;
    }
}
One could also use _import_array, which returns a negative number on failure, to implement custom error handling. See here for the definition of import_array.
Warning: As pointed out by @isra60, _import_array()/import_array() can only be called once Python is initialized, i.e. after Py_Initialize() was called. This is always the case for an extension, but not always when the Python interpreter is embedded, because numpy_initialized is initialized before main starts. In this case, the initialization trick should not be used; instead, call init_numpy() explicitly after Py_Initialize().
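The ordering problem from the warning can be sketched without NumPy. The block below is only a model: fake_py_initialize(), fake_import_array() and init_numpy_explicit() are hypothetical stand-ins for Py_Initialize(), _import_array() and an explicit init function. They are not the real CPython/NumPy API. It shows the pattern of checking the negative return value and initializing explicitly once the interpreter is up.

```cpp
#include <iostream>

// Hypothetical stand-ins, NOT the real CPython/NumPy API.
static bool g_interpreter_ready = false;               // models "Py_Initialize() was called"
static void* g_slots[3] = {nullptr, nullptr, nullptr}; // models the table stored in the capsule
static void** g_api_table = nullptr;                   // models PyArray_API

// Models Py_Initialize().
void fake_py_initialize() { g_interpreter_ready = true; }

// Models _import_array(): negative on failure, 0 on success.
int fake_import_array() {
    if (!g_interpreter_ready) return -1;
    g_api_table = g_slots;
    return 0;
}

// Explicit initialization with custom error handling; in the embedded
// case this would be called from main() after the interpreter is started.
bool init_numpy_explicit() {
    if (fake_import_array() < 0) {
        std::cerr << "numpy import failed: interpreter not initialized\n";
        return false;
    }
    return true;
}

// True once the API table pointer has been set.
bool api_ready() { return g_api_table != nullptr; }
```

With the real API, the same shape would presumably be a check of `_import_array() < 0` placed right after Py_Initialize() in main().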
Sophisticated solution:
NB: For why setting PyArray_API is needed, see this SO-answer: it postpones the resolution of symbols until run time, so NumPy's shared objects aren't needed at link time and need not be on the dynamic library path (Python's system path is enough then).
The proposed solution is quick, but if more than one cpp-file uses NumPy, each of them gets its own instance of PyArray_API, and every instance has to be initialized.
This can be avoided if PyArray_API isn't defined as static but as extern in all but one translation unit. For those translation units, the NO_IMPORT_ARRAY macro must be defined before numpy/arrayobject.h is included.
We need, however, one translation unit in which this symbol is defined. For this translation unit the macro NO_IMPORT_ARRAY must not be defined.
However, without defining the macro PY_ARRAY_UNIQUE_SYMBOL we get only a static symbol, i.e. one not visible to other translation units, so the linker will fail. The reason for this default: if two libraries each defined a non-static PyArray_API, we would have multiple definitions of the same symbol and the linker would fail, i.e. we could not use both libraries together.
Thus, by defining PY_ARRAY_UNIQUE_SYMBOL as MY_FANCY_LIB_PyArray_API prior to every include of numpy/arrayobject.h, we get our own name for PyArray_API, which does not clash with other libraries.
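The difference between a static pointer per translation unit and one shared pointer can be modelled in a single file. This is only a sketch: each namespace stands in for one cpp-file, and all names (unit_a, unit_b, shared, ...) are hypothetical and do not come from NumPy.

```cpp
// Each namespace models one .cpp file; all names are hypothetical.

// Variant 1: every unit has its own static pointer,
// like "static void **PyArray_API" in each translation unit.
namespace unit_a {
    static void* slots[1] = {nullptr};
    static void** api = nullptr;
    void init() { api = slots; }             // models import_array() in this unit
    bool ready() { return api != nullptr; }
}
namespace unit_b {
    static void** api = nullptr;             // separate copy, never initialized
    bool ready() { return api != nullptr; }
}

// Variant 2: one definition that all units refer to, which is what the
// PY_ARRAY_UNIQUE_SYMBOL / NO_IMPORT_ARRAY combination arranges.
namespace shared {
    void* slots[1] = {nullptr};
    void** api = nullptr;                    // defined exactly once
    void init() { api = slots; }             // models the single import_array() call
}
namespace unit_c { bool ready() { return shared::api != nullptr; } }
namespace unit_d { bool ready() { return shared::api != nullptr; } }
```

Calling unit_a::init() leaves unit_b uninitialized, while a single shared::init() makes the table visible to both unit_c and unit_d.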
Putting it all together:
A: use_numpy.h - your header for including numpy-functionality i.e. numpy/arrayobject.h
//use_numpy.h
//your fancy name for the dedicated PyArray_API-symbol
#define PY_ARRAY_UNIQUE_SYMBOL MY_PyArray_API
//this macro must be defined for the translation unit
#ifndef INIT_NUMPY_ARRAY_CPP
#define NO_IMPORT_ARRAY //for usual translation units
#endif
//now, everything is setup, just include the numpy-arrays:
#include <numpy/arrayobject.h>
B: init_numpy_api.cpp - a translation unit for initializing the global MY_PyArray_API:
//init_numpy_api.cpp
//first make clear, here we initialize the MY_PyArray_API
#define INIT_NUMPY_ARRAY_CPP
//now include the arrayobject.h, which defines
//void **MY_PyArray_API
#include "use_numpy.h"
//now the old trick with initialization:
int init_numpy(){
    import_array(); // PyError if not successful
    return 0;
}
const static int numpy_initialized = init_numpy();
C: just include use_numpy.h whenever you need numpy; it will declare extern void **MY_PyArray_API:
//example
#include "use_numpy.h"
...
PyArray_Check(obj); // works, no segmentation error
Warning: Do not forget that for the initialization trick to work, Py_Initialize() must already have been called.
Why do you need it (kept for historical reasons):
When I build your extension with debug symbols:
extra_compile_args=['-fPIC', '-O0', '-g'],
extra_link_args=['-O0', '-g'],
and run it with gdb:
gdb --args python run_test.py
(gdb) run
--- Segmentation fault
(gdb) disass
I can see the following:
0x00007ffff1d2a6d9 <+20>: mov 0x203260(%rip),%rax # 0x7ffff1f2d940 <_ZL11PyArray_API>
0x00007ffff1d2a6e0 <+27>: add $0x10,%rax
=> 0x00007ffff1d2a6e4 <+31>: mov (%rax),%rax
...
(gdb) print $rax
$1 = 16
We should keep in mind that PyArray_Check is only a define for:
#define PyArray_Check(op) PyObject_TypeCheck(op, &PyArray_Type)
It seems that &PyArray_Type somehow uses a part of PyArray_API which is not initialized (has value 0).
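The value 16 in $rax fits this picture: a null table pointer plus the offset of one slot. Below is a NumPy-free model of that lookup. It assumes that the type object sits in slot 2 of the table, which is what the 0x10 offset suggests on a 64-bit build; api_table, kTypeSlot and the helper functions are hypothetical names, not NumPy's.

```cpp
#include <cstddef>

// Hypothetical model of the API table; NOT NumPy's real definitions.
static void** api_table = nullptr;       // models PyArray_API before import_array()
constexpr std::size_t kTypeSlot = 2;     // assumed slot of the type object

// Byte offset that is added to the table pointer before it is dereferenced.
constexpr std::size_t slot_offset(std::size_t index) {
    return index * sizeof(void*);
}

// Guarded lookup. The real macro has no such guard, so with a null table
// it reads from address 0 + slot_offset(kTypeSlot) and crashes.
void* type_slot_or_null() {
    return api_table ? api_table[kTypeSlot] : nullptr;
}

// Models a successful import_array(): the table pointer becomes valid.
static int dummy_type = 0;
static void* filled[3] = {nullptr, nullptr, &dummy_type};
void model_import_array() { api_table = filled; }
```

On a platform with 8-byte pointers, slot_offset(kTypeSlot) is 16, the address the faulting instruction tries to read.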
Let's take a look at cpp_parser.cpp after the preprocessor (compiled with flag -E):
static void **PyArray_API= __null
...
static int
_import_array(void)
{
PyArray_API = (void **)PyCapsule_GetPointer(c_api,...
So PyArray_API is static and is initialized via _import_array(void). That would actually explain the warning I get during the build, that _import_array() was defined but not used: we didn't initialize PyArray_API.
Because PyArray_API is a static variable, it must be initialized in every compilation unit, i.e. in every cpp-file.
So we just need to do it; import_array() seems to be the official way.