
Cython objects can expose memory buffers to Python code
by implementing the “buffer protocol”.
This chapter shows how to implement the protocol
and make use of the memory managed by an extension type from NumPy.

As it stands, the Matrix class has no methods to do anything productive with a matrix’s contents.
We could implement custom __getitem__, __setitem__, etc. for this,
but instead we’ll use the buffer protocol to expose the matrix’s data to Python
so we can use NumPy to do useful work.

# distutils: language = c++

from cpython cimport Py_buffer
from libcpp.vector cimport vector

cdef class Matrix:
    cdef Py_ssize_t ncols
    cdef Py_ssize_t shape[2]
    cdef Py_ssize_t strides[2]
    cdef vector[float] v

    def __cinit__(self, Py_ssize_t ncols):
        self.ncols = ncols

    def add_row(self):
        """Adds a row, initially zero-filled."""
        self.v.resize(self.v.size() + self.ncols)

    def __getbuffer__(self, Py_buffer *buffer, int flags):
        cdef Py_ssize_t itemsize = sizeof(self.v[0])

        # Floor division: the number of complete rows stored in the vector.
        self.shape[0] = self.v.size() // self.ncols
        self.shape[1] = self.ncols

        # Stride 1 is the distance, in bytes, between two items in a row;
        # this is the distance between two adjacent items in the vector.
        # Stride 0 is the distance between the first elements of adjacent rows.
        self.strides[1] = <Py_ssize_t>(  <char *>&(self.v[1])
                                       - <char *>&(self.v[0]))
        self.strides[0] = self.ncols * self.strides[1]

        buffer.buf = <char *>&(self.v[0])
        buffer.format = 'f'                     # float
        buffer.internal = NULL                  # see References
        buffer.itemsize = itemsize
        buffer.len = self.v.size() * itemsize   # product(shape) * itemsize
        buffer.ndim = 2
        buffer.obj = self
        buffer.readonly = 0
        buffer.shape = self.shape
        buffer.strides = self.strides
        buffer.suboffsets = NULL                # for pointer arrays only

    def __releasebuffer__(self, Py_buffer *buffer):
        pass

The method Matrix.__getbuffer__ fills a descriptor structure,
called a Py_buffer, that is defined by the Python C-API.
It contains a pointer to the actual buffer in memory,
as well as metadata about the shape of the array and the strides
(step sizes to get from one element or row to the next).
Its shape and strides members are pointers
that must point to arrays of type and size Py_ssize_t[ndim].
These arrays have to stay alive as long as any buffer views the data,
so we store them on the Matrix object as members.
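To make the role of shape and strides concrete, here is a minimal pure-Python sketch of the arithmetic a consumer performs to locate an item. It is an illustration only, not part of the Matrix class: the names raw, shape, strides, and item_at are hypothetical, and a packed bytes object stands in for the vector's memory.

```python
import struct

# Stand-in for the vector's memory: 2 rows x 3 columns of C floats, row by row.
ncols = 3
values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
itemsize = struct.calcsize("f")            # size of one C float in bytes
raw = struct.pack("%df" % len(values), *values)

# The same metadata __getbuffer__ fills in for a matrix with these dimensions.
shape = (len(values) // ncols, ncols)
strides = (ncols * itemsize, itemsize)     # bytes per row step, bytes per column step

def item_at(i, j):
    """Read element (i, j) by turning the indices into a byte offset."""
    offset = i * strides[0] + j * strides[1]
    return struct.unpack_from("f", raw, offset)[0]

print(shape, strides, item_at(1, 2))
```

The total length in bytes equals the product of the shape entries times the item size, which is what the buffer.len line in __getbuffer__ computes.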

The code is not yet complete, but we can already compile it
and test the basic functionality.
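One way to check what an exporter reports, without involving NumPy, is Python's built-in memoryview, which requests a buffer and exposes the descriptor's fields as attributes. The helper below is a sketch: describe_buffer is a hypothetical name, and because the compiled Matrix module is not available here, a standard-library array cast to two dimensions serves as a stand-in exporter. Assuming the extension builds, passing a Matrix instance with at least one row should work the same way.

```python
from array import array

def describe_buffer(obj):
    """Return the metadata a consumer sees when it requests a buffer from obj."""
    with memoryview(obj) as view:          # the view is released on exit
        return {
            "format": view.format,
            "itemsize": view.itemsize,
            "ndim": view.ndim,
            "shape": view.shape,
            "strides": view.strides,
            "nbytes": view.nbytes,
            "readonly": view.readonly,
        }

# Stand-in exporter: 2 rows x 3 columns of C floats in one flat block,
# the same layout Matrix uses for its vector.
flat = array("f", [0.0] * 6)
standin = memoryview(flat).cast("B").cast("f", shape=[2, 3])

info = describe_buffer(standin)
print(info)
```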

The Matrix class as implemented so far is unsafe.
The add_row operation can move the underlying buffer,
which invalidates any NumPy (or other) view on the data.
If you try to access values after an add_row call,
you’ll get outdated values or a segfault.
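For comparison, Python's built-in bytearray faces the same hazard and avoids it by refusing to resize while a view is exported. The short sketch below uses only the standard library; the variable names are illustrative.

```python
buf = bytearray(4)           # stand-in for the matrix storage
view = memoryview(buf)       # a consumer now holds a view of the data

try:
    buf.extend(b"\x00" * 4)  # growing could move the underlying memory
    resized_while_viewed = True
except BufferError:
    resized_while_viewed = False

view.release()               # the consumer is done with the data
buf.extend(b"\x00" * 4)      # resizing is allowed again

print(resized_while_viewed, len(buf))
```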

This is where __releasebuffer__ comes in.
We can add a count of active views to each matrix,
and lock the matrix against mutation whenever a view exists.
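The counting logic can be modelled in pure Python as follows. This is only a sketch of the idea, not the Cython implementation: LockedRows, acquire_view, and release_view are hypothetical names, with the latter two standing in for __getbuffer__ and __releasebuffer__, and the choice of ValueError is an assumption.

```python
class LockedRows:
    """Pure-Python model of a matrix that refuses to grow while viewed."""

    def __init__(self, ncols):
        self.ncols = ncols
        self.data = []
        self.view_count = 0

    def add_row(self):
        """Append a zero-filled row unless a view is outstanding."""
        if self.view_count > 0:
            raise ValueError("can't add row while being viewed")
        self.data.extend([0.0] * self.ncols)

    def acquire_view(self):
        """Stand-in for __getbuffer__: hand out the data and count the view."""
        self.view_count += 1
        return self.data

    def release_view(self):
        """Stand-in for __releasebuffer__: the consumer is done."""
        self.view_count -= 1


m = LockedRows(3)
m.add_row()
m.acquire_view()
try:
    m.add_row()
    blocked = False
except ValueError:
    blocked = True
m.release_view()
m.add_row()                  # allowed again once no view remains
print(blocked, len(m.data))
```

In the extension type itself, the counter would be a cdef attribute incremented at the end of __getbuffer__ and decremented in __releasebuffer__.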

We skipped some input validation in the code.
The flags argument to __getbuffer__ comes from np.asarray
(and other clients) and is an OR of boolean flags
that describe the kind of array that is requested.
Strictly speaking, if the request is for PyBUF_SIMPLE, PyBUF_ND,
or PyBUF_F_CONTIGUOUS, __getbuffer__ must raise a BufferError.
These macros can be cimport’d from cpython.buffer.
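The validation described above can be sketched in plain Python. The numeric values below are copied from CPython's headers purely for illustration; in the extension they would come from cpython.buffer instead. The function name check_request and its nrows parameter are hypothetical, and the single-row exception follows the note on F-contiguity rather than any official recipe.

```python
# Request flags, with the values CPython defines for them.
PyBUF_SIMPLE = 0x0
PyBUF_FORMAT = 0x4
PyBUF_ND = 0x8
PyBUF_STRIDES = 0x10 | PyBUF_ND
PyBUF_F_CONTIGUOUS = 0x40 | PyBUF_STRIDES

def check_request(flags, nrows):
    """Raise BufferError for requests the strided matrix cannot satisfy."""
    # PyBUF_SIMPLE and PyBUF_ND both lack the strides bit, so the consumer
    # has said it cannot handle the strides we intend to fill in.
    if (flags & PyBUF_STRIDES) != PyBUF_STRIDES:
        raise BufferError("consumer must accept strides")
    # Row-major storage is Fortran-contiguous only for a single row.
    if (flags & PyBUF_F_CONTIGUOUS) == PyBUF_F_CONTIGUOUS and nrows > 1:
        raise BufferError("matrix is not Fortran-contiguous")

def is_accepted(flags, nrows):
    """Small wrapper that reports the outcome as a boolean."""
    try:
        check_request(flags, nrows)
        return True
    except BufferError:
        return False

print(is_accepted(PyBUF_STRIDES | PyBUF_FORMAT, 2), is_accepted(PyBUF_ND, 2))
```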

(The matrix-in-vector structure actually conforms to PyBUF_ND,
but that would prohibit __getbuffer__ from filling in the strides.
A single-row matrix is F-contiguous, but a larger matrix is not.)

Reference documentation is available for Python 3 and Python 2.
The Py2 documentation also describes an older buffer protocol
that is no longer in use;
since Python 2.6, the PEP 3118 protocol has been implemented,
and the older protocol is only relevant for legacy code.