C++ - Boost - Using Boost.Python Numpy from_data()

Submitted by Mi-K on Sunday, November 24, 2019 - 1:14pm

If you use Python, NumPy is a great tool for manipulating arrays.

But how do we do the same with C++ and the Boost.Python NumPy extension?

That's what we are going to see in this Windows tutorial, with an easy example of the from_data() method.

First of all

You need Python 3 and Boost installed on your computer.

To have the exact same software and libraries installed in the exact same locations, I suggest following these two tutorials:

Python: https://www.badprog.com/python-3-installing-on-windows-10
Boost: https://www.badprog.com/c-boost-setting-up-on-windows-10

How the from_data() method works

The Boost.Python NumPy extension provides the from_data() method to expose an existing C++ array as a NumPy ndarray. The data is not copied: the ndarray reads the memory of the original array directly.

This method has 5 parameters:

A pointer to the first element of the array;
The type (dtype) of each element in this array;
The shape of the array;
The strides of this array;
The owner of this array.

So as you can see, it's all about an existing array.

Some parameters are easy to understand, but that's not necessarily the case for the shape and the strides.

The shape

The shape is the number of elements along each dimension of the new array.

We need to use the tuple class for that.

The first parameter is the number of rows.

The second is the number of columns (or, if you prefer, the number of elements in each row).

So in the example below we are going to create a classicArray, then view it through different shapes (how many rows and columns the new array has).

Our classic array will have 3 rows with 4 columns for each row:

Our first row is { 1, 2, 3, 4 }.
The second row: { 5, 6, 7, 8 }.
And the third row: { 9, 10, 11, 12 }.

So our shape has 3 rows with 4 elements in each row.

That’s why we’ll use make_tuple(3, 4).
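
When the data lives in a built-in C++ array, the shape can be read off the array type instead of being typed by hand. Here is a minimal sketch, assuming a hypothetical helper named shapeOf (it is not part of Boost.Python):

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not part of Boost.Python): deduces the shape
// { rows, columns } of a built-in 2D array from its type.
template <typename T, std::size_t Rows, std::size_t Cols>
std::array<std::size_t, 2> shapeOf(const T (&)[Rows][Cols]) {
  return {{ Rows, Cols }};
}
```

With a 3 x 4 array such as classicArray, shapeOf(classicArray) gives { 3, 4 }, the two values passed to make_tuple(3, 4).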

The strides

What are strides?

Strides are the offsets, in bytes, between 2 consecutive elements in a row and between one row and the next.

In the example below we'll keep the same array but view it each time with a new shape or new strides, in order to understand how they work.

To set the strides we also need the tuple class, here with 2 parameters.

The first is the offset, in bytes, between the first element of a row and the first element of the next row.

The second is the offset, in bytes, between two consecutive elements in a row.

Because strides are expressed in bytes, each one is a number of elements multiplied by typeSize, the size of one element.
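
For a plain C++ 2D array, whose rows are stored one after the other in memory, both strides follow from the element size and the number of columns. Here is a minimal sketch, assuming a hypothetical helper named rowMajorStrides (it is not part of Boost.Python):

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not part of Boost.Python): byte strides of a
// contiguous built-in 2D array, in the order { row stride, element stride }.
template <typename T, std::size_t Rows, std::size_t Cols>
std::array<std::size_t, 2> rowMajorStrides(const T (&)[Rows][Cols]) {
  return {{
    Cols * sizeof(T), // bytes from the start of one row to the start of the next
    sizeof(T)         // bytes from one element to the next one in the same row
  }};
}
```

For an int[3][4] array this gives { 4 * sizeof(int), sizeof(int) }, that is { 16, 4 } when an int is 4 bytes wide.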

Examples

Example 1

So for the first example in the code below, we have a shape of (3, 4) and we want to display all the elements in all the rows.

Thus we’ll have to set strides like this:

make_tuple(typeSize * 4, typeSize * 1);

Indeed, we need to move 1 element forward to jump from an element to the next one, and 4 elements forward to jump from the first element of a row to the first element of the next row.

Example 2

For example 2, after displaying the whole first array, we kept the shape and changed the strides:

make_tuple(typeSize * 3, typeSize * 2);

The offset between the elements of a row is now 2 (meaning that we'll use one element out of every two).

And we also changed the offset from the first element of a row to the first element of the next row: it is now 3.

We still have 4 elements in each row, though.

So we'll get the number "4" as first element of the second row, because moving 3 elements forward from "1" (the first element of the first row) lands on the number "4" in the original array.

And we'll get a wrong number as the last element, because we'd have needed a 13th element in the array.

So the result is a value read from outside the original array. Reading it is undefined behavior, so the value may differ on your machine. In our example it's -858993460 (0xCCCCCCCC, the pattern Visual C++ debug builds use to fill uninitialized stack memory).
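
This kind of overrun can be detected before calling from_data(), because the last element a view touches depends only on the shape and the strides. Here is a minimal sketch, assuming two hypothetical helpers (they are not part of Boost.Python) and strides counted in elements rather than bytes to keep the arithmetic easy to follow:

```cpp
#include <cstddef>

// Hypothetical helper (not part of Boost.Python): largest flat element index
// touched by a 2D view. Strides are in elements here, not in bytes.
// Assumes rows >= 1 and cols >= 1.
std::size_t lastIndexTouched(std::size_t rows, std::size_t cols,
                             std::size_t rowStride, std::size_t colStride) {
  return (rows - 1) * rowStride + (cols - 1) * colStride;
}

// True when every element of the view lies inside a buffer that holds
// elementCount items.
bool viewFitsInBuffer(std::size_t elementCount,
                      std::size_t rows, std::size_t cols,
                      std::size_t rowStride, std::size_t colStride) {
  return lastIndexTouched(rows, cols, rowStride, colStride) < elementCount;
}
```

For a shape of (3, 4) with strides of 3 and 2 elements, the last index is 2 * 3 + 3 * 2 = 12, one past the valid range 0 to 11 of a 12-element array, so viewFitsInBuffer returns false.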

Example 3

For example 3 we changed both the shape, now (3, 3), and the strides:

make_tuple(typeSize * 3, typeSize * 1);

This time we wanted to display the numbers of the original array from "1" up to "9", with 3 elements per row, so the offset to the next row is 3.
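
To check which values a given shape and strides combination exposes without starting Python, the address computation can be reproduced in plain C++. Here is a minimal sketch, assuming a hypothetical helper named readView (it is not part of Boost.Python and only models the byte-offset arithmetic described above):

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of Boost.Python): copies the elements that a
// 2D strided view exposes. Strides are in bytes, as in the tutorial.
// The caller must make sure the view stays inside the buffer.
std::vector<std::vector<int>> readView(const int *data,
                                       std::size_t rows, std::size_t cols,
                                       std::size_t rowStrideBytes,
                                       std::size_t colStrideBytes) {
  const unsigned char *base = reinterpret_cast<const unsigned char *>(data);
  std::vector<std::vector<int>> result(rows, std::vector<int>(cols));
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t c = 0; c < cols; ++c) {
      // Byte offset of element (r, c) from the first element of the buffer.
      const std::size_t offset = r * rowStrideBytes + c * colStrideBytes;
      std::memcpy(&result[r][c], base + offset, sizeof(int));
    }
  }
  return result;
}
```

Called on the 12 values 1 to 12 with 3 rows, 3 columns and strides of 3 * sizeof(int) and sizeof(int), it returns the rows { 1, 2, 3 }, { 4, 5, 6 } and { 7, 8, 9 }.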

Let's code a bit

// badprog.com
#include "pch.h"
#include <boost/python/numpy.hpp>
#include <iostream>
//
namespace python  = boost::python;
namespace numpy   = boost::python::numpy;
// ----------------------------------------------------------------------------
//
// ----------------------------------------------------------------------------
int main(int argc, char **argv) {
  //
  Py_Initialize();
  numpy::initialize();
  //
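  int classicArray[3][4] = { { 1, 2, 3, 4 }, { 5, 6, 7, 8 }, { 9, 10, 11, 12 } };
  //
  // size in bytes of one element, used to express the strides in bytes
  int typeSize = sizeof(int);
  //
  // the dtype must match the element type of classicArray
  numpy::dtype npDtype = numpy::dtype::get_builtin<int>();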
  //
  python::tuple shape;
  python::tuple strides;
  //
  // before modifying (example 1)
  //
  shape   = python::make_tuple(3, 4);
  strides = python::make_tuple(typeSize * 4, typeSize * 1);
  //
  numpy::ndarray npNdarray = numpy::from_data(
    classicArray, // array of data
    npDtype, // type
    shape, // shape
    strides, // strides
    python::object() // owner object
  );
  //
  std::cout
    << "classicArray before modification: " << std::endl
    << python::extract<char const *>(python::str(npNdarray)) << std::endl << std::endl;
  //
  // after modification 1 (example 2): the last element is read past the end of classicArray
  //
  shape   = python::make_tuple(3, 4);
  strides = python::make_tuple(typeSize * 3, typeSize * 2);
  //
  npNdarray = numpy::from_data(
    classicArray, // array of data
    npDtype, // type
    shape, // shape
    strides, // strides
    python::object() // owner object
  );
  //
  std::cout
    << "classicArray after modification 1: " << std::endl
    << python::extract<char const *>(python::str(npNdarray)) << std::endl << std::endl;
  //
  // after modification 2 (example 3)
  //
  shape   = python::make_tuple(3, 3);
  strides = python::make_tuple(typeSize * 3, typeSize * 1);
  //
  npNdarray = numpy::from_data(
    classicArray, // array of data
    npDtype, // type
    shape, // shape
    strides, // strides
    python::object() // owner object
  );
  //
  std::cout
    << "classicArray after modification 2: " << std::endl
    << python::extract<char const *>(python::str(npNdarray)) << std::endl << std::endl;
}

	 

Build, run and result

Your console should display the following:

classicArray before modification:
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]

classicArray after modification 1:
[[         1          3          5          7]
 [         4          6          8         10]
 [         7          9         11 -858993460]]

classicArray after modification 2:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Conclusion

The from_data() method is essential to expose C++ data as a NumPy ndarray.

The most complex part was the strides.

So if you got them, good job, you got it all.