Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

SWIG interfacing C library to Python (Creating 'iterable' Python data type from C 'sequence' struct)

I have written a Python extension for a C library. I have a data structure that looks like this:

typedef struct _mystruct{
   double * clientdata;
   size_t   len;
} MyStruct;

This datatype maps directly onto the list data type in Python. I therefore want to create 'list-like' behavior for the exported struct, so that code written using my C extension is more 'Pythonic'.

In particular, this is what I want to be able to do from Python code (note: py_cstruct is a C struct datatype being accessed in Python).

My requirements can be summarized as:

  1. list(py_cstruct) returns a Python list with all contents copied out from the C struct
  2. py_cstruct[i] returns the ith element (preferably throwing IndexError on an invalid index)
  3. for elem in py_cstruct: ability to enumerate

According to PEP 234, an object can be iterated over with "for" if it implements __iter__() or __getitem__(). Using that logic, I think that by adding the following attributes (via rename) to my SWIG interface file, I will have the desired behavior (apart from requirement 1 above, which I still don't know how to achieve):

__len__
__getitem__
__setitem__

I am now able to index the C object in Python. I have not yet implemented the Python exception throwing; instead, if the array bounds are exceeded, I return a magic number (error code).

The interesting thing is what happens when I attempt to iterate over the struct using 'for x in' syntax, for example:

for i in py_cstruct:
    print(i)

Python enters an infinite loop that simply prints the magic (error) number mentioned above on the console, which suggests to me that there is something wrong with the indexing.

Last but not least, how can I implement requirement 1? As I understand it, this involves:

  • handling the function call list() from Python
  • returning a Python list from C code

[[Update]]

I would be interested in seeing a little code snippet on what (if any) declarations I need to put in my interface file, so that I can iterate over the elements of the c struct, from Python.


1 Reply

The simplest solution to this is to implement __getitem__ and throw an IndexError exception for an invalid index.
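
To see why raising IndexError is enough, here is a minimal pure-Python sketch of the same protocol. The class names FakeStruct and BrokenStruct are hypothetical stand-ins for the wrapped MyStruct, and no SWIG is involved. When an object has __getitem__ but no __iter__, both the for loop and list() call __getitem__ with 0, 1, 2, ... and stop at the first IndexError, which is also why returning an error code instead never terminates.

```python
from itertools import islice


class FakeStruct(object):
    """Hypothetical stand-in for the wrapped MyStruct (not SWIG-generated)."""

    def __init__(self, values):
        self._values = [float(v) for v in values]

    def __len__(self):
        return len(self._values)

    def __getitem__(self, i):
        # Raising IndexError is what tells the for loop to stop.
        if not 0 <= i < len(self._values):
            raise IndexError("Index out of bounds")
        return self._values[i]


class BrokenStruct(FakeStruct):
    """Returns a magic error code instead of raising, as in the question."""

    ERROR_CODE = -1.0

    def __getitem__(self, i):
        if not 0 <= i < len(self._values):
            return self.ERROR_CODE  # the loop never sees an IndexError
        return self._values[i]


good = FakeStruct([0, 1, 2])
as_list = list(good)        # requirement 1: copy the contents into a list
looped = [x for x in good]  # requirement 3: plain for-style iteration

# Iterating BrokenStruct would never end, so only take the first five items.
first_five = list(islice(BrokenStruct([0, 1, 2]), 5))
```

So once __getitem__ raises IndexError, requirements 1 and 3 need no extra code on the Python side.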

I put together an example of this, using %extend and %exception in SWIG to implement __getitem__ and raise an exception respectively:

%module test

%include "exception.i"

%{
#include <assert.h>
#include "test.h"
static int myErr = 0; // flag to save error state
%}

%exception MyStruct::__getitem__ {
  assert(!myErr);
  $action
  if (myErr) {
    myErr = 0; // clear flag for next time
    // You could also check the value in $result, but it's a PyObject here
    SWIG_exception(SWIG_IndexError, "Index out of bounds");
  }
}

%include "test.h"

%extend MyStruct {
  double __getitem__(size_t i) {
    if (i >= $self->len) {
      myErr = 1;
      return 0;
    }
    return $self->clientdata[i];
  }
}

I tested it by adding to test.h:

#include <stdlib.h> // for malloc

static MyStruct *test() {
  static MyStruct inst = {0,0};
  if (!inst.clientdata) {
    // Fill the struct once with the values 0.0 .. 9.0
    inst.len = 10;
    inst.clientdata = malloc(sizeof(double)*inst.len);
    for (size_t i = 0; i < inst.len; ++i) {
      inst.clientdata[i] = (double)i;
    }
  }
  return &inst;
}

And running the following Python:

import test

for i in test.test():
  print(i)

Which prints:

python run.py
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0

and then finishes.


An alternative approach, using a typemap to map MyStruct onto a PyList directly, is possible too:

%module test

%{
#include "test.h"
%}

%typemap(out) (MyStruct *) {
  PyObject *list = PyList_New($1->len);
  for (size_t i = 0; i < $1->len; ++i) {
    PyList_SetItem(list, i, PyFloat_FromDouble($1->clientdata[i]));
  }

  $result = list;
}

%include "test.h"

This will create a PyList with the return value from any function that returns a MyStruct *. I tested this %typemap(out) with the exact same function as the previous method.

You can also write a corresponding %typemap(in) and %typemap(freearg) for the reverse, something like this untested code:

%typemap(in) (MyStruct *) {
  if (!PyList_Check($input)) {
    SWIG_exception(SWIG_TypeError, "Expecting a PyList");
    return NULL;
  }
  MyStruct *tmp = malloc(sizeof(MyStruct));
  tmp->len = PyList_Size($input);
  tmp->clientdata = malloc(sizeof(double) * tmp->len);
  for (size_t i = 0; i < tmp->len; ++i) {
    tmp->clientdata[i] = PyFloat_AsDouble(PyList_GetItem($input, i));
    if (PyErr_Occurred()) {
      free(tmp->clientdata);
      free(tmp);
      SWIG_exception(SWIG_TypeError, "Expecting a double");
      return NULL;
    }
  }
  $1 = tmp;
}

%typemap(freearg) (MyStruct *) {
  // $1 is still NULL if the input typemap bailed out early
  if ($1) {
    free($1->clientdata);
    free($1);
  }
}
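
The two conversions can also be modelled in pure Python with the standard ctypes module, which may help when checking the logic before building the extension. This is only a sketch of what the typemaps do, not SWIG output, and the helper names struct_from_list and list_from_struct are hypothetical.

```python
import ctypes


class MyStruct(ctypes.Structure):
    # Same layout as the C struct: double *clientdata; size_t len;
    _fields_ = [
        ("clientdata", ctypes.POINTER(ctypes.c_double)),
        ("len", ctypes.c_size_t),
    ]


def struct_from_list(values):
    """Rough equivalent of %typemap(in): Python list -> MyStruct."""
    if not isinstance(values, list):
        raise TypeError("Expecting a list")
    try:
        # Building the array converts every element to a C double.
        buf = (ctypes.c_double * len(values))(*values)
    except TypeError:
        raise TypeError("Expecting a double")
    s = MyStruct(ctypes.cast(buf, ctypes.POINTER(ctypes.c_double)), len(values))
    s._buf = buf  # keep the buffer alive for as long as the struct lives
    return s


def list_from_struct(s):
    """Rough equivalent of %typemap(out): MyStruct -> list of floats."""
    return [s.clientdata[i] for i in range(s.len)]


round_trip = list_from_struct(struct_from_list([1, 2.5, 3]))
```

There is no counterpart to %typemap(freearg) here because Python's garbage collector releases the buffer; in the C typemaps that cleanup has to be explicit.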

Using an iterator would make more sense for containers like linked lists, but for completeness' sake here's how you might go about doing it for MyStruct with __iter__. The key bit is that you get SWIG to wrap another type for you, which provides the __iter__() and next() methods needed (Python 3 names the latter __next__()). In this case that type is MyStructIter, which is defined and wrapped at the same time using %inline since it's not part of the normal C API:

%module test

%include "exception.i"

%{
#include <assert.h>
#include "test.h"
static int myErr = 0;
%}

%exception MyStructIter::next {
  assert(!myErr);
  $action
  if (myErr) {
    myErr = 0; // clear flag for next time
    PyErr_SetString(PyExc_StopIteration, "End of iterator");
    return NULL;
  }
}

%inline %{
  struct MyStructIter {
    double *ptr;
    size_t len;
  };
%}

%include "test.h"

%extend MyStructIter {
  struct MyStructIter *__iter__() {
    return $self;
  }

  double next() {
    if ($self->len) {
      --$self->len; // decrement only while items remain, so len never wraps around
      return *$self->ptr++;
    }
    myErr = 1;
    return 0;
  }
}

%extend MyStruct {
  struct MyStructIter __iter__() {
    struct MyStructIter ret = { $self->clientdata, $self->len };
    return ret;
  }
}

To support iteration, the container needs to implement __iter__() and return a new iterator from it. The iterator must supply next(), which returns the next item and advances the iterator, and it must also supply an __iter__() method of its own. This means that either the container or an iterator can be used identically.
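
As a reference point, the same container/iterator split looks like this in plain Python. The classes below are hypothetical stand-ins rather than the SWIG-generated wrappers. The sketch defines __next__() and aliases next to it so that it runs on either Python version.

```python
class MyStructIter(object):
    """Iterator: remembers which values it walks and the position in them."""

    def __init__(self, values):
        self._values = values
        self._pos = 0

    def __iter__(self):
        return self  # an iterator returns itself

    def __next__(self):
        if self._pos < len(self._values):
            item = self._values[self._pos]
            self._pos += 1
            return item
        raise StopIteration

    next = __next__  # Python 2 name for the same method


class MyStruct(object):
    """Container: hands out a fresh iterator on every __iter__() call."""

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        return MyStructIter(self._values)


container = MyStruct([0.0, 1.0, 2.0])
from_container = list(container)  # iterate via the container
it = iter(container)
first = next(it)
from_iterator = list(it)          # iterate via a partly used iterator
again = list(container)           # the container itself is not consumed
```

Because each call to __iter__() on the container builds a new iterator, nested or repeated loops over the same container do not interfere with each other.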

MyStructIter needs to keep track of the current state of iteration: where we are and how much we have left. In this example I did that by keeping a pointer to the next item and a counter that we use to tell when we hit the end. You could also keep track of the state by keeping a pointer to the MyStruct the iterator is using and a counter for the position within that, something like:

%inline %{
  struct MyStructIter {
    MyStruct *list;
    size_t pos;
  };
%}

%include "test.h"

%extend MyStructIter {
  struct MyStructIter *__iter__() {
    return $self;
  }

  double next() {
    if ($self->pos < $self->list->len) {
      return $self->list->clientdata[$self->pos++];
    }
    myErr = 1;
    return 0;
  }
}

%extend MyStruct {
  struct MyStructIter __iter__() {
    struct MyStructIter ret = { $self, 0 };
    return ret;
  }
}

(In this instance we could actually have just used the container itself as the iterator, by supplying an __iter__() that returned a copy of the container and a next() similar to the first type. I didn't do that in my original answer because I thought that would be less clear than having two distinct types: a container and an iterator for that container.)
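
For comparison, a single-type variant along those lines might look like the following in plain Python. This is only one reading of the idea, and the class name SelfIteratingStruct is hypothetical.

```python
class SelfIteratingStruct(object):
    """Hypothetical container that doubles as its own iterator type."""

    def __init__(self, values, pos=0):
        self._values = values  # shared with copies, like the clientdata pointer
        self._pos = pos

    def __iter__(self):
        # Return a copy so that each loop gets its own position.
        return SelfIteratingStruct(self._values, self._pos)

    def __next__(self):
        if self._pos < len(self._values):
            item = self._values[self._pos]
            self._pos += 1
            return item
        raise StopIteration

    next = __next__  # Python 2 name for the same method


container = SelfIteratingStruct([0.0, 1.0, 2.0])
first_pass = list(container)
second_pass = list(container)  # still complete: only the copies advanced
```

One caveat of this sketch is that calling iter() on a partly consumed copy yields yet another copy rather than the same object, which departs from the usual convention that an iterator returns itself from __iter__().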

