===============
C-API for NumPy
===============

:Author:          Travis Oliphant
:Discussions to:  `numpy-discussion@python.org`__
:Created:         October 2005

__ http://scipy.org/scipylib/mailing-lists.html

The C API of NumPy is (mostly) backward compatible with Numeric.

There are a few non-standard Numeric usages (that were not really part
of the API) that will need to be changed:

* If you used any of the function pointers in the ``PyArray_Descr``
  structure you will have to modify your usage of those.  First,
  the pointers are all under the member named ``f``.  So ``descr->cast``
  is now ``descr->f->cast``.  In addition, the
  casting functions no longer take a strides argument (use
  ``PyArray_CastTo`` if you need strided casting). All functions have
  one or two ``PyArrayObject *`` arguments at the end, which allows
  flexible and misbehaved arrays to be handled.

* The ``descr->zero`` and ``descr->one`` constants have been replaced with
  the function calls ``PyArray_Zero`` and ``PyArray_One`` (be sure to read
  the code and free the resulting memory if you use these calls).

* If you passed ``array->dimensions`` and ``array->strides`` around
  to functions, you will need to fix some code. These are now
  ``npy_intp*`` pointers rather than ``int*`` pointers. On 32-bit systems
  the two types have the same size, so there won't be a problem.
  However, on 64-bit systems, you will need to make changes to avoid
  errors and segfaults.
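
The last point can be illustrated with a small sketch that does not depend on
NumPy. It assumes ``intptr_t`` as a stand-in for ``npy_intp`` (the real typedef
comes from the NumPy headers), and ``demo_widen_dims`` is a hypothetical helper
name, not part of the C-API. The idea is to convert legacy ``int`` dimensions
element by element instead of casting the pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for npy_intp (assumption: a pointer-sized signed integer). */
typedef intptr_t demo_intp;

/* Hypothetical helper: copy legacy int dimensions into a pointer-sized
   array one element at a time.  Casting the int* itself to demo_intp*
   would read the wrong bytes wherever the two sizes differ. */
void
demo_widen_dims(const int *src, demo_intp *dst, int nd)
{
    assert(nd >= 0);
    for (int i = 0; i < nd; i++) {
        dst[i] = (demo_intp)src[i];
    }
}
```

Code that keeps its own ``int`` shape arrays could call a helper like this
right before handing the shape to any function that expects ``npy_intp *``.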


The header files ``arrayobject.h`` and ``ufuncobject.h`` contain many defines
that you may find useful.  The files ``__ufunc_api.h`` and
``__multiarray_api.h`` contain the available C-API function calls with
their function signatures.

All of these headers are installed to
``<YOUR_PYTHON_LOCATION>/site-packages/numpy/core/include``.


Getting arrays in C-code
=========================

All new arrays can be created using ``PyArray_NewFromDescr``.  A simple interface
equivalent to ``PyArray_FromDims`` is ``PyArray_SimpleNew(nd, dims, typenum)``
and to ``PyArray_FromDimsAndData`` is
``PyArray_SimpleNewFromData(nd, dims, typenum, data)``.

``PyArray_NewFromDescr`` is a very flexible function.

::

  PyObject * PyArray_NewFromDescr(PyTypeObject *subtype, PyArray_Descr *descr,
                                int nd, npy_intp *dims,
                                npy_intp *strides, char *data,
                                int flags, PyObject *obj);

``subtype`` : ``PyTypeObject *``
    The subtype that should be created (either pass in
    ``&PyArray_Type``,  or ``obj->ob_type``,
    where ``obj`` is an instance of a subtype (or subclass) of
    ``PyArray_Type``).

``descr`` : ``PyArray_Descr *``
    The type descriptor for the array. This is a Python object (this
    function steals a reference to it). The easiest way to get one is
    using ``PyArray_DescrFromType(<typenum>)``. If you want to use a
    flexible size array, then you need to use
    ``PyArray_DescrNewFromType(<flexible typenum>)`` and set its ``elsize``
    parameter to the desired size. The typenum in both of these cases
    is one of the ``PyArray_XXXX`` enumerated types.

``nd`` : ``int``
    The number of dimensions (<``MAX_DIMS``)

``dims`` : ``npy_intp *``
    A pointer to the size in each dimension. Information will be
    copied from here.

``strides`` : ``npy_intp *``
    The strides this array should have. For new arrays created by this
    routine, this should be ``NULL``. If you pass in memory for this array
    to use, then you can pass in the strides information as well
    (otherwise it will be created for you and default to C-contiguous
    or Fortran-contiguous). Any strides will be copied into the array
    structure. Do not pass in invalid strides information.

    ``PyArray_CheckStrides(...)`` can help validate strides, but you must
    call it yourself if you are unsure. You cannot pass in strides
    information when ``data`` is ``NULL`` and this routine is creating its
    own memory.

``data`` : ``char *``
    ``NULL`` for creating brand-new memory. If you want this array to wrap
    another memory area, then pass the pointer here. You are
    responsible for deleting the memory in that case, but do not do so
    until the new array object has been deleted. The best way to
    handle that is to get the memory from another Python object,
    ``INCREF`` that Python object after passing its data pointer to this
    routine, and set the ``->base`` member of the returned array to the
    Python object. *You are responsible for* setting ``PyArray_BASE(ret)``
    to the base object. Failure to do so will create a memory leak.

    If you pass in a data buffer, the ``flags`` argument will be the flags
    of the new array. If you create a new array, a non-zero flags
    argument indicates that you want the array to be in Fortran order.

``flags`` : ``int``
    Either the flags showing how to interpret the data buffer passed
    in, or if a new array is created, nonzero to indicate a Fortran
    order array. See below for an explanation of the flags.

``obj`` : ``PyObject *``
    If ``subtype`` is ``&PyArray_Type``, this argument is
    ignored. Otherwise, the ``__array_finalize__`` method of the subtype
    is called (if present) and passed this object. This is usually an
    array of the type to be created (so the ``__array_finalize__`` method
    must handle an array argument), but it can be anything.

Note: The returned array object will be uninitialized unless the type is
``PyArray_OBJECT`` in which case the memory will be set to ``NULL``.
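
The default layout mentioned under ``strides`` can be sketched without NumPy.
This is only an illustration of C-order versus Fortran-order byte strides:
``demo_intp`` stands in for ``npy_intp`` and ``demo_default_strides`` is a
hypothetical name, not the routine NumPy uses internally.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t demo_intp;  /* stand-in for npy_intp (assumption) */

/* Fill strides (in bytes) for a freshly allocated array.
   fortran == 0: C order, the last index varies fastest.
   fortran != 0: Fortran order, the first index varies fastest. */
void
demo_default_strides(int nd, const demo_intp *dims, demo_intp itemsize,
                     int fortran, demo_intp *strides)
{
    demo_intp step = itemsize;
    assert(nd >= 0 && itemsize > 0);
    if (fortran) {
        for (int i = 0; i < nd; i++) {
            strides[i] = step;
            step *= dims[i];
        }
    }
    else {
        for (int i = nd - 1; i >= 0; i--) {
            strides[i] = step;
            step *= dims[i];
        }
    }
}
```

For a 2 x 3 x 4 array of 8-byte items this gives strides of (96, 32, 8) in
C order and (8, 16, 48) in Fortran order.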

``PyArray_SimpleNew(nd, dims, typenum)`` is a drop-in replacement for
``PyArray_FromDims`` (except it takes ``npy_intp*`` dims instead of ``int*`` dims
which matters on 64-bit systems) and it does not initialize the memory
to zero.

``PyArray_SimpleNew`` is just a macro for ``PyArray_New`` with default arguments.
Use ``PyArray_FILLWBYTE(arr, 0)``  to fill with zeros.

The ``PyArray_FromDims`` and family of functions are still available and
are loose wrappers around this function.  These functions still take
``int *`` arguments.  This should be fine on 32-bit systems, but on 64-bit
systems you may run into trouble if you frequently passed
``PyArray_FromDims`` the dimensions member of the old ``PyArrayObject`` structure
because ``sizeof(npy_intp) != sizeof(int)``.


Getting an arrayobject from an arbitrary Python object
======================================================

``PyArray_FromAny(...)``

This function replaces ``PyArray_ContiguousFromObject`` and friends (those
function calls still remain but they are loose wrappers around the
``PyArray_FromAny`` call).

::

  static PyObject *
  PyArray_FromAny(PyObject *op, PyArray_Descr *dtype, int min_depth,
                  int max_depth, int requires, PyObject *context)


``op`` : ``PyObject *``
    The Python object to "convert" to an array object

``dtype`` : ``PyArray_Descr *``
    The desired data-type descriptor. This can be ``NULL``, if the
    descriptor should be determined by the object. Unless ``FORCECAST`` is
    present in ``requires``, this call will generate an error if the data
    type cannot be safely obtained from the object.

``min_depth`` : ``int``
    The minimum depth of array needed, or 0 if it does not matter.

``max_depth`` : ``int``
    The maximum depth of array allowed, or 0 if it does not matter.

``requires`` : ``int``
    A flag indicating the "requirements" of the returned array. These
    are the usual ndarray flags (see `NDArray flags`_ below). In
    addition, there are three flags used only for the ``FromAny``
    family of functions:

      - ``ENSURECOPY``: always copy the array. Returned arrays always
        have ``CONTIGUOUS``, ``ALIGNED``, and ``WRITEABLE`` set.
      - ``ENSUREARRAY``: ensure the returned array is an ndarray.
      - ``FORCECAST``: cause a cast to occur regardless of whether or
        not it is safe.

``context`` : ``PyObject *``
    If the Python object ``op`` is not a numpy array, but has an
    ``__array__`` method, context is passed as the second argument to
    that method (the first is the typecode). Almost always this
    parameter is ``NULL``.
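
The meaning of ``min_depth`` and ``max_depth`` can be written down as a tiny
predicate. This is a sketch of the rule as described above (0 means "no
constraint"), with the hypothetical name ``demo_depth_ok``. It is not the
check ``PyArray_FromAny`` performs internally.

```c
#include <assert.h>

/* Returns 1 if an array with ndim dimensions satisfies the requested
   depth bounds, 0 otherwise.  A bound of 0 means "no constraint". */
int
demo_depth_ok(int ndim, int min_depth, int max_depth)
{
    assert(ndim >= 0);
    if (min_depth != 0 && ndim < min_depth) {
        return 0;
    }
    if (max_depth != 0 && ndim > max_depth) {
        return 0;
    }
    return 1;
}
```

Passing the same nonzero value for both bounds therefore asks for an array of
exactly that many dimensions.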


``PyArray_ContiguousFromAny(op, typenum, min_depth, max_depth)`` is
equivalent to ``PyArray_ContiguousFromObject(...)`` (which is still
available), except it will return the subclass if ``op`` is already an
instance of an ndarray subclass. The ``ContiguousFromObject`` version will
always return an ndarray.

Passing Data Type information to C-code
=======================================

All datatypes are handled using the ``PyArray_Descr *`` structure.
This structure can be obtained from a Python object using
``PyArray_DescrConverter`` and ``PyArray_DescrConverter2``.  The former
returns the default ``PyArray_LONG`` descriptor when the input object
is ``None``, while the latter returns ``NULL`` when the input object is ``None``.

See the ``arraymethods.c`` and ``multiarraymodule.c`` files for many
examples of usage.

Getting at the structure of the array
-------------------------------------

You should use the ``#defines`` provided to access array structure portions:

- ``PyArray_DATA(obj)`` : returns a ``void *`` to the array data
- ``PyArray_BYTES(obj)`` : returns a ``char *`` to the array data
- ``PyArray_ITEMSIZE(obj)``
- ``PyArray_NDIM(obj)``
- ``PyArray_DIMS(obj)``
- ``PyArray_DIM(obj, n)``
- ``PyArray_STRIDES(obj)``
- ``PyArray_STRIDE(obj,n)``
- ``PyArray_DESCR(obj)``
- ``PyArray_BASE(obj)``

See ``arrayobject.h`` for more.


NDArray Flags
=============

The ``flags`` attribute of the ``PyArrayObject`` structure contains important
information about the memory used by the array (pointed to by the data member).
This flags information must be kept accurate or strange results and even
segfaults may result.

There are several (binary) flags that describe the memory area used by the
data buffer.  These constants are defined in ``arrayobject.h`` and
determine the bit-position of the flag.  Python exposes a nice
attribute-based interface as well as a dictionary-like interface for getting
(and, if appropriate, setting) these flags.

Memory areas of all kinds can be pointed to by an ndarray, necessitating
these flags.  If you get an arbitrary ``PyArrayObject`` in C-code,
you need to be aware of the flags that are set.
If you need to guarantee a certain kind of array
(like ``NPY_CONTIGUOUS`` and ``NPY_BEHAVED``), then pass these requirements
into the ``PyArray_FromAny`` function.


``NPY_CONTIGUOUS``
    True if the array is (C-style) contiguous in memory.
``NPY_FORTRAN``
    True if the array is (Fortran-style) contiguous in memory.

Notice that contiguous 1-d arrays are always both Fortran contiguous
and C contiguous. Both flags are conveniences only: whether an array is
``NPY_CONTIGUOUS`` or ``NPY_FORTRAN`` can always be determined from its
``strides``, ``dimensions``, and ``itemsize`` attributes.
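
How the two contiguity flags follow from ``strides``, ``dimensions``, and
``itemsize`` can be sketched as below. The function names and the
``demo_intp`` typedef are hypothetical stand-ins, and the check is
deliberately strict: it ignores any special handling NumPy may apply to
dimensions of length 0 or 1.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t demo_intp;  /* stand-in for npy_intp (assumption) */

/* Returns 1 if the strides describe a gap-free C-order layout. */
int
demo_is_c_contiguous(int nd, const demo_intp *dims,
                     const demo_intp *strides, demo_intp itemsize)
{
    demo_intp expected = itemsize;
    assert(nd >= 0 && itemsize > 0);
    for (int i = nd - 1; i >= 0; i--) {   /* last index varies fastest */
        if (strides[i] != expected) {
            return 0;
        }
        expected *= dims[i];
    }
    return 1;
}

/* Returns 1 if the strides describe a gap-free Fortran-order layout. */
int
demo_is_f_contiguous(int nd, const demo_intp *dims,
                     const demo_intp *strides, demo_intp itemsize)
{
    demo_intp expected = itemsize;
    assert(nd >= 0 && itemsize > 0);
    for (int i = 0; i < nd; i++) {        /* first index varies fastest */
        if (strides[i] != expected) {
            return 0;
        }
        expected *= dims[i];
    }
    return 1;
}
```

With one dimension both loops perform the same single comparison, which is
why a contiguous 1-d array satisfies both checks.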

``NPY_OWNDATA``
    True if the array owns the memory (it will try and free it using
    ``PyDataMem_FREE()`` on deallocation --- so it better really own it).

These three flags facilitate using a data pointer that is a memory-mapped
array, or part of some larger record array.  But, they may have other uses...

``NPY_ALIGNED``
    True if the data buffer is aligned for the type and the strides
    are multiples of the alignment factor as well.  This can be
    checked.

``NPY_WRITEABLE``
    True only if the data buffer can be "written" to.

``NPY_WRITEBACKIFCOPY``
    This is a special flag that is set if this array represents a copy
    made because a user required certain flags in ``PyArray_FromAny`` and
    a copy had to be made of some other array (and the user asked for
    this flag to be set in such a situation). The base attribute then
    points to the "misbehaved" array (which is set read-only). If you use
    this flag, you must call ``PyArray_ResolveWritebackIfCopy`` before
    deallocating this array (i.e. before calling ``Py_DECREF`` the last time)
    which will write the data contents back to the "misbehaved" array (casting
    if necessary) and will reset the "misbehaved" array to ``WRITEABLE``. If
    the "misbehaved" array was not ``WRITEABLE`` to begin with then
    ``PyArray_FromAny`` would have returned an error because ``WRITEBACKIFCOPY``
    would not have been possible. In error conditions, call
    ``PyArray_DiscardWritebackIfCopy`` to throw away the scratch buffer, then
    ``Py_DECREF`` or ``Py_XDECREF``.

``NPY_UPDATEIFCOPY``
    Similar to ``NPY_WRITEBACKIFCOPY``, but deprecated since it copied the
    contents back when the array is deallocated, which is not explicit and
    relies on refcount semantics. Refcount semantics are unreliable on
    alternative implementations of Python such as PyPy.
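
The write-back protocol described for ``NPY_WRITEBACKIFCOPY`` can be modelled
in plain C. Everything in this sketch (the ``demo_array`` and ``demo_copy``
structures and the three functions) is a hypothetical, heavily simplified
stand-in. It only mirrors the sequence of events: lock the original, work on a
scratch copy, then either resolve (copy back) or discard.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified model of an array: a buffer plus a writeable bit. */
typedef struct {
    double *data;
    size_t  n;
    int     writeable;
} demo_array;

/* Scratch copy that remembers the original ("base") array. */
typedef struct {
    double     *data;
    demo_array *base;
} demo_copy;

/* Make the scratch copy and mark the original read-only.
   Returns -1 if the original is not writeable or allocation fails. */
int
demo_make_writeback_copy(demo_array *orig, demo_copy *out)
{
    if (!orig->writeable) {
        return -1;
    }
    out->data = malloc(orig->n * sizeof(double));
    if (out->data == NULL) {
        return -1;
    }
    memcpy(out->data, orig->data, orig->n * sizeof(double));
    out->base = orig;
    orig->writeable = 0;
    return 0;
}

/* Success path: write results back, unlock the original, free scratch. */
void
demo_resolve(demo_copy *c)
{
    assert(c->base != NULL);
    memcpy(c->base->data, c->data, c->base->n * sizeof(double));
    c->base->writeable = 1;
    free(c->data);
    c->data = NULL;
    c->base = NULL;
}

/* Error path: free scratch and unlock, leaving the original data alone. */
void
demo_discard(demo_copy *c)
{
    assert(c->base != NULL);
    c->base->writeable = 1;
    free(c->data);
    c->data = NULL;
    c->base = NULL;
}
```

The point of the explicit resolve step is that the copy-back happens at a
place you choose, not whenever the last reference happens to disappear.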

``PyArray_UpdateFlags(obj, flags)`` will update the ``obj->flags`` for
``flags`` which can be any of ``NPY_CONTIGUOUS``, ``NPY_FORTRAN``, ``NPY_ALIGNED``, or
``NPY_WRITEABLE``.
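
As an illustration of what recomputing a flag involves, the condition given
for ``NPY_ALIGNED`` (data pointer aligned, strides multiples of the alignment)
can be sketched as follows. ``demo_is_aligned`` and ``demo_intp`` are
hypothetical names, and the alignment is passed in by the caller here, whereas
NumPy takes it from the data-type descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t demo_intp;  /* stand-in for npy_intp (assumption) */

/* Returns 1 if the data pointer and every stride are multiples of
   `alignment` (in bytes), 0 otherwise.  The pointer is never dereferenced. */
int
demo_is_aligned(const void *data, int nd, const demo_intp *strides,
                size_t alignment)
{
    assert(alignment > 0 && nd >= 0);
    if ((uintptr_t)data % alignment != 0) {
        return 0;
    }
    for (int i = 0; i < nd; i++) {
        if (strides[i] % (demo_intp)alignment != 0) {
            return 0;
        }
    }
    return 1;
}
```

A view into the middle of a packed record array is a typical case where such
a check fails even though the underlying buffer itself is well aligned.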

Some useful combinations of these flags:

- ``NPY_BEHAVED = NPY_ALIGNED | NPY_WRITEABLE``
- ``NPY_CARRAY = NPY_DEFAULT = NPY_CONTIGUOUS | NPY_BEHAVED``
- ``NPY_CARRAY_RO = NPY_CONTIGUOUS | NPY_ALIGNED``
- ``NPY_FARRAY = NPY_FORTRAN | NPY_BEHAVED``
- ``NPY_FARRAY_RO = NPY_FORTRAN | NPY_ALIGNED``

The macro ``PyArray_CHECKFLAGS(obj, flags)``  can test any combination of flags.
There are several default combinations defined as macros already
(see ``arrayobject.h``)
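
The way the combinations above are built and tested can be sketched with
ordinary bit masks. The ``DEMO_`` constants use made-up bit positions chosen
for illustration only (the real values are in ``arrayobject.h``), and
``demo_checkflags`` is a hypothetical function showing the all-bits-set test
one would expect from a macro like ``PyArray_CHECKFLAGS``.

```c
#include <assert.h>

/* Illustrative bit positions only; not the values NumPy uses. */
enum {
    DEMO_CONTIGUOUS = 1 << 0,
    DEMO_FORTRAN    = 1 << 1,
    DEMO_OWNDATA    = 1 << 2,
    DEMO_ALIGNED    = 1 << 3,
    DEMO_WRITEABLE  = 1 << 4,

    /* Combinations, following the list above. */
    DEMO_BEHAVED    = DEMO_ALIGNED | DEMO_WRITEABLE,
    DEMO_CARRAY     = DEMO_CONTIGUOUS | DEMO_BEHAVED,
    DEMO_CARRAY_RO  = DEMO_CONTIGUOUS | DEMO_ALIGNED,
    DEMO_FARRAY     = DEMO_FORTRAN | DEMO_BEHAVED,
    DEMO_FARRAY_RO  = DEMO_FORTRAN | DEMO_ALIGNED
};

/* Returns 1 only if every bit in `wanted` is also set in `flags`. */
int
demo_checkflags(int flags, int wanted)
{
    assert(wanted != 0);
    return (flags & wanted) == wanted;
}
```

A read-only, aligned, C-contiguous buffer passes the ``CARRAY_RO`` test but
not the ``CARRAY`` test, because the writeable bit is missing.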

In particular, there are ``ISBEHAVED``, ``ISBEHAVED_RO``, ``ISCARRAY``
and ``ISFARRAY`` macros that also check to make sure the array is in
native byte order (as determined by the data-type descriptor).

There are more C-API enhancements which you can discover in the code,
or buy the book (http://www.trelgol.com)