array_split documentation

Release: 0.3.0
Version: 0.3.0
Date: May 30, 2017
array_split python package

The array_split python package is a modest enhancement to the numpy.array_split function for sub-dividing multi-dimensional arrays into sub-arrays (slices). The main motivation comes from parallel processing where one desires to split (decompose) a large array (or multiple arrays) into smaller sub-arrays which can be processed concurrently by other processes (multiprocessing or mpi4py) or other memory-limited hardware (e.g. GPGPU using pyopencl, pycuda, etc).

Quick Start Example

>>> from array_split import array_split, shape_split
>>> import numpy as np
>>>
>>> ary = np.arange(0, 4*9)
>>>
>>> array_split(ary, 4) # 1D split into 4 sections (like numpy.array_split)
[array([0, 1, 2, 3, 4, 5, 6, 7, 8]),
 array([ 9, 10, 11, 12, 13, 14, 15, 16, 17]),
 array([18, 19, 20, 21, 22, 23, 24, 25, 26]),
 array([27, 28, 29, 30, 31, 32, 33, 34, 35])]
>>>
>>> shape_split(ary.shape, 4) # 1D split into 4 sections, slice objects instead of numpy.ndarray views
array([(slice(0, 9, None),), (slice(9, 18, None),), (slice(18, 27, None),), (slice(27, 36, None),)],
      dtype=[('0', 'O')])
>>>
>>> ary = ary.reshape(4, 9) # Make ary 2D
>>> split = shape_split(ary.shape, axis=(2, 3)) # 2D split into 2*3=6 sections
>>> split.shape
(2, 3)
>>> split
array([[(slice(0, 2, None), slice(0, 3, None)),
        (slice(0, 2, None), slice(3, 6, None)),
        (slice(0, 2, None), slice(6, 9, None))],
       [(slice(2, 4, None), slice(0, 3, None)),
        (slice(2, 4, None), slice(3, 6, None)),
        (slice(2, 4, None), slice(6, 9, None))]],
      dtype=[('0', 'O'), ('1', 'O')])
>>> sub_arys = [ary[tup] for tup in split.flatten()] # Split ary into sub-array views using the slice tuples.
>>> sub_arys
[array([[ 0,  1,  2], [ 9, 10, 11]]),
 array([[ 3,  4,  5], [12, 13, 14]]),
 array([[ 6,  7,  8], [15, 16, 17]]),
 array([[18, 19, 20], [27, 28, 29]]),
 array([[21, 22, 23], [30, 31, 32]]),
 array([[24, 25, 26], [33, 34, 35]])]

Latest sphinx documentation examples at http://array-split.readthedocs.io/en/latest/examples/.

Installation

Using pip:

pip install array_split # with root access

or:

pip install --user array_split # no root/sudo permissions required

From latest github source:

git clone https://github.com/array-split/array_split.git
cd array_split
python setup.py install --user

Requirements

Requires numpy version >= 1.6, python-2 version >= 2.6 or python-3 version >= 3.2.

Testing

Run tests (unit-tests and doctest module docstring tests) using:

python -m array_split.tests

or, from the source tree, run:

python setup.py test

Travis CI at:

Documentation

Latest sphinx generated documentation is at:

and at github gh-pages:

Sphinx documentation can be built from the source:

python setup.py build_sphinx

with the HTML generated in docs/build/html.

Latest source code

Source at github:

License information

See the file LICENSE.txt for terms & conditions, for usage and a DISCLAIMER OF ALL WARRANTIES.

Terminology

Definitions:

tile
A multi-dimensional sub-array of an array (e.g. numpy.ndarray) decomposition.
slice
A tuple of slice elements defining the extents of a tile/sub-array.
cut
A division along an axis to form tiles or slices.
split
The sub-division (tiling) of an array (or an array shape) resulting from cuts.
halo
Per-axis number of elements which specifies the expansion of a tile (in the negative and positive axis directions) to form an overlap of elements with neighbouring tiles. The overlaps are often referred to as ghost cells or ghost elements.
sub-tile
A sub-array of a tile.

Parameter Categories

There are four categories of parameters for specifying a split:

Number of tiles
The total number of tiles and/or the number of slices per axis. The indices_or_sections parameter can specify the number of tiles in the resulting split (as an int).
Per-axis split indices
The per-axis indices specifying where the array (shape) is to be cut. The indices_or_sections parameter doubles up to indicate the indices at which cuts are to occur.
Tile shape
Explicitly specify the shape of the tile in a split. The tile_shape parameter (typically as a lone keyword argument) indicates the tile shape.
Tile maximum number of bytes
Given the number of bytes per array element, a tile shape is calculated such that all tiles (including halo extension) of the resulting split do not exceed a specified (maximum) number of bytes. The array_itemsize parameter gives the number of bytes per array element and the max_tile_bytes parameter constrains the maximum number of bytes per tile.

The subsequent sections provide examples from each of these categories.

Import statements for the examples

In the examples of the following sections, we assume that the following statements have been issued to import the relevant modules and functions:

>>> import numpy
>>> from array_split import array_split, shape_split, ShapeSplitter

Comparison between array_split, shape_split and ShapeSplitter

The array_split.array_split() function is analogous to the numpy.array_split() function. It takes a numpy.ndarray object as an argument and returns a list of tiles (numpy.ndarray sub-array objects):

>>> numpy.array_split(numpy.arange(0, 10), 3)
[array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
>>> array_split(numpy.arange(0, 10), 3) # array_split.array_split
[array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
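The uneven section lengths above (4, 3, 3) follow the rule used by numpy.array_split: when the length is not divisible by the number of sections, the leading sections each receive one extra element. The sketch below reproduces that rule in plain Python. The helper names section_sizes and section_slices are hypothetical and are not part of array_split or numpy.

```python
def section_sizes(length, num_sections):
    """Return per-section lengths for an as-even-as-possible 1D split."""
    base, extra = divmod(length, num_sections)
    # The first `extra` sections receive one additional element.
    return [base + 1] * extra + [base] * (num_sections - extra)


def section_slices(length, num_sections):
    """Turn the section sizes into consecutive slice objects."""
    slices, start = [], 0
    for size in section_sizes(length, num_sections):
        slices.append(slice(start, start + size))
        start += size
    return slices


sizes = section_sizes(10, 3)
tiles = [list(range(10))[s] for s in section_slices(10, 3)]
print(sizes)  # [4, 3, 3]
print(tiles)  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```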

The array_split.shape_split() function takes an array shape as an argument instead of an actual array, and returns a numpy structured array of tuple elements. The tuple elements can then be used to generate the tiles from a numpy.ndarray of an equivalent shape:

>>> ary = numpy.arange(0, 10)
>>> split = shape_split(ary.shape, 3) # returns array of tuples
>>> split
array([(slice(0, 4, None),), (slice(4, 7, None),), (slice(7, 10, None),)],
      dtype=[('0', 'O')])
>>> [ary[slyce] for slyce in split.flatten()] # generates tile views of ary
[array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]

Each tuple element of the returned split has length equal to the number of dimensions of the shape, i.e. N = len(array_shape). Each tuple indicates the indexing extent of a tile.
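Because each tuple holds one slice per axis, the shape of a tile can be recovered from its tuple alone. The following is a minimal sketch, assuming every slice has a step of None and non-negative start and stop values; tile_shape is a hypothetical helper, not a function of the package.

```python
def tile_shape(slice_tuple):
    """Shape of the tile addressed by a tuple of slice objects (step assumed None)."""
    return tuple(s.stop - s.start for s in slice_tuple)


slyce = (slice(0, 2, None), slice(3, 6, None))  # one 2D tile extent
print(len(slyce))         # 2
print(tile_shape(slyce))  # (2, 3)
```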

The array_split.ShapeSplitter class contains the bulk of the split implementation for the array_split.shape_split() function. The array_split.ShapeSplitter.__init__() constructor takes the same arguments as the array_split.shape_split() function, and the array_split.ShapeSplitter.calculate_split() method computes the split. After the split computation, some state information is preserved in the array_split.ShapeSplitter data attributes:

>>> ary = numpy.arange(0, 10)
>>> splitter = ShapeSplitter(ary.shape, 3)
>>> split = splitter.calculate_split()
>>> split.shape
(3,)
>>> split
array([(slice(0, 4, None),), (slice(4, 7, None),), (slice(7, 10, None),)],
      dtype=[('0', 'O')])
>>> [ary[slyce] for slyce in split.flatten()]
[array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
>>>
>>> splitter.split_shape # equivalent to split.shape above
array([3])
>>> splitter.split_begs  # start indices for tile extents
[array([0, 4, 7])]
>>> splitter.split_ends  # stop indices for tile extents
[array([ 4,  7, 10])]
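The split_begs and split_ends attributes hold one array of start indices and one array of stop indices per axis. The sketch below shows one way such per-axis extents can be combined into a slice tuple per tile, in the same row-major order as the splits printed in this document. It uses plain lists in place of numpy arrays, and slices_from_extents is a hypothetical helper rather than the library's implementation.

```python
import itertools


def slices_from_extents(split_begs, split_ends):
    """Combine per-axis start/stop indices into one slice tuple per tile (C order)."""
    per_axis = [
        [slice(b, e) for b, e in zip(begs, ends)]
        for begs, ends in zip(split_begs, split_ends)
    ]
    # The Cartesian product varies the last axis fastest, i.e. row-major order.
    return list(itertools.product(*per_axis))


tiles_1d = slices_from_extents([[0, 4, 7]], [[4, 7, 10]])
tiles_2d = slices_from_extents([[0, 2], [0, 3, 6]], [[2, 4], [3, 6, 9]])
print([(t[0].start, t[0].stop) for t in tiles_1d])  # [(0, 4), (4, 7), (7, 10)]
print(len(tiles_2d))  # 6
```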

Methods of the array_split.ShapeSplitter class can be over-ridden in sub-classes in order to customise the splitting behaviour.

The examples of the following section explicitly illustrate the behaviour for the array_split.shape_split() function, but with minor modifications, the examples are also relevant for the array_split.array_split() function and for instances of the array_split.ShapeSplitter class.

Splitting by number of tiles

Single axis number of tiles

When the indices_or_sections parameter is specified as an integer (scalar), it specifies the number of tiles in the returned split:

>>> split = shape_split([20,], 4)  # 1D, array_shape=[20,], number of tiles=4, default axis=0
>>> split.shape
(4,)
>>> split
array([(slice(0, 5, None),), (slice(5, 10, None),), (slice(10, 15, None),),
       (slice(15, 20, None),)],
      dtype=[('0', 'O')])

By default, cuts are made along the axis = 0 axis. In the multi-dimensional case, one can over-ride the axis using the axis parameter, e.g. for a 2D shape:

>>> split = shape_split([20,10], 4, axis=1)  # Split along axis=1
>>> split.shape
(1, 4)
>>> split
array([[(slice(0, 20, None), slice(0, 3, None)),
        (slice(0, 20, None), slice(3, 6, None)),
        (slice(0, 20, None), slice(6, 8, None)),
        (slice(0, 20, None), slice(8, 10, None))]],
      dtype=[('0', 'O'), ('1', 'O')])

Multiple axes number of tiles

The axis parameter can also be used to specify the number of slices (sections) per-axis:

>>> split = shape_split([20, 10], axis=[3, 2])  # Cut into 3*2=6 tiles
>>> split.shape
(3, 2)
>>> split
array([[(slice(0, 7, None), slice(0, 5, None)),
        (slice(0, 7, None), slice(5, 10, None))],
       [(slice(7, 14, None), slice(0, 5, None)),
        (slice(7, 14, None), slice(5, 10, None))],
       [(slice(14, 20, None), slice(0, 5, None)),
        (slice(14, 20, None), slice(5, 10, None))]],
      dtype=[('0', 'O'), ('1', 'O')])

The array axis 0 has been cut into three sections and axis 1 has been cut into two sections for a total of 3*2 = 6 tiles. In general, if axis is an integer (scalar) it indicates the single axis which is to be cut to form slices. When axis is a sequence, then axis[i] indicates the number of sections into which axis i is to be cut.
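The two interpretations of axis can be summarised as a normalisation to a per-axis section count, which matches the split.shape values shown above. This is only a sketch: sections_per_axis is a hypothetical helper, and it does not handle the non-positive axis entries that the library also accepts.

```python
def sections_per_axis(ndim, num_sections=None, axis=0):
    """Hypothetical normalisation of (num_sections, axis) to per-axis section counts."""
    if isinstance(axis, int):
        # Scalar axis: only that axis is cut, into num_sections pieces.
        counts = [1] * ndim
        counts[axis] = num_sections
        return counts
    # Sequence axis: axis[i] is already the number of sections for axis i.
    counts = list(axis)
    if len(counts) != ndim:
        raise ValueError("axis sequence must have one entry per array dimension")
    return counts


print(sections_per_axis(1, 4))            # [4]
print(sections_per_axis(2, 4, axis=1))    # [1, 4]
print(sections_per_axis(2, axis=[3, 2]))  # [3, 2]
```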

In addition, one can specify a total number of tiles and use the axis parameter to limit which axes are cut: a positive axis[i] fixes the number of sections for axis i (a value of 1 means no cut), while a non-positive axis[i] leaves the number of sections for axis i to be calculated. For example, in 3D, cut into 8 tiles, but only cut the axis=1 and axis=2 axes:

>>> split = shape_split([20, 10, 15], 8, axis=[1, 0, 0])  # Cut into 1*?*?=8 tiles
>>> split.shape
(1, 4, 2)
>>> split
array([[[(slice(0, 20, None), slice(0, 3, None), slice(0, 8, None)),
         (slice(0, 20, None), slice(0, 3, None), slice(8, 15, None))],
        [(slice(0, 20, None), slice(3, 6, None), slice(0, 8, None)),
         (slice(0, 20, None), slice(3, 6, None), slice(8, 15, None))],
        [(slice(0, 20, None), slice(6, 8, None), slice(0, 8, None)),
         (slice(0, 20, None), slice(6, 8, None), slice(8, 15, None))],
        [(slice(0, 20, None), slice(8, 10, None), slice(0, 8, None)),
         (slice(0, 20, None), slice(8, 10, None), slice(8, 15, None))]]],
      dtype=[('0', 'O'), ('1', 'O'), ('2', 'O')])

In the above, non-positive elements of axis are replaced with positive values such that numpy.prod(axis) equals the number of requested tiles (= 8 above). A ValueError is raised if no such replacement exists:

>>> try:
...     split = shape_split([20, 10, 15], 8, axis=[1, 3, 0])  # Impossible to cut into 1*3*?=8 tiles
... except (ValueError,) as e:
...     e
...
ValueError('Unable to construct grid of num_slices=8 elements from num_slices_per_axis=[1, 3, 0] (with max_slices_per_axis=[20 10 15])',)
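The failure above can be anticipated with a divisibility check: the positive entries of axis are fixed, so their product must divide the requested number of tiles. The sketch below performs only that check, using the hypothetical helper free_axes_product. How the remaining factor is distributed over the free axes (4 and 2 in the earlier example) is decided by the library and is not modelled here.

```python
def free_axes_product(num_tiles, axis):
    """Return the product that the non-positive entries of axis must supply."""
    fixed = 1
    for a in axis:
        if a > 0:
            fixed *= a  # positive entries are fixed section counts
    if num_tiles % fixed != 0:
        raise ValueError(
            "cannot reach %d tiles with fixed per-axis counts %r" % (num_tiles, list(axis))
        )
    return num_tiles // fixed


print(free_axes_product(8, [1, 0, 0]))  # 8
try:
    free_axes_product(8, [1, 3, 0])     # 3 does not divide 8
except ValueError as e:
    print(e)
```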

Splitting by per-axis cut indices

Single axis cut indices

The indices_or_sections parameter can also be used to specify the location (index values) of cuts:

>>> split = shape_split([20,], [5, 7, 9])  # 1D, split into 4 tiles, default cut axis=0
>>> split.shape
(4,)
>>> split
array([(slice(0, 5, None),), (slice(5, 7, None),), (slice(7, 9, None),),
       (slice(9, 20, None),)],
      dtype=[('0', 'O')])

Here, three cuts have been made to form 4 slices, cuts at index 5, index 7 and index 9.
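The relationship between cuts and slices (n cuts give n + 1 slices) can be sketched in a few lines of plain Python. The helper name slices_from_cuts is hypothetical, and the cut indices are assumed to be sorted and within the array bounds.

```python
def slices_from_cuts(length, cut_indices):
    """Return the consecutive slices produced by cutting [0, length) at cut_indices."""
    bounds = [0] + list(cut_indices) + [length]
    return [slice(beg, end) for beg, end in zip(bounds[:-1], bounds[1:])]


slices = slices_from_cuts(20, [5, 7, 9])
print([(s.start, s.stop) for s in slices])  # [(0, 5), (5, 7), (7, 9), (9, 20)]
```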

Similarly, in 2D, the indices_or_sections cuts can be made along axis = 1 only:

>>> split = shape_split([20, 13], [5, 7, 9], axis=1)  # 2D, cut into 4 tiles, cut axis=1
>>> split.shape
(1, 4)
>>> split
array([[(slice(0, 20, None), slice(0, 5, None)),
        (slice(0, 20, None), slice(5, 7, None)),
        (slice(0, 20, None), slice(7, 9, None)),
        (slice(0, 20, None), slice(9, 13, None))]],
      dtype=[('0', 'O'), ('1', 'O')])

Multiple axes cut indices

The indices_or_sections parameter can also be used to cut along multiple axes. In this case, the indices_or_sections parameter is specified as a sequence of sequences, so that indices_or_sections[i] specifies the cut indices along axis i. For example, in 3D, cut along axis=1 and axis=2 only:

>>> split = shape_split([20, 13, 64], [[], [7], [15, 30, 45]])  # 3D, split into 8 tiles, no cuts on axis=0
>>> split.shape
(1, 2, 4)
>>> split
array([[[(slice(0, 20, None), slice(0, 7, None), slice(0, 15, None)),
         (slice(0, 20, None), slice(0, 7, None), slice(15, 30, None)),
         (slice(0, 20, None), slice(0, 7, None), slice(30, 45, None)),
         (slice(0, 20, None), slice(0, 7, None), slice(45, 64, None))],
        [(slice(0, 20, None), slice(7, 13, None), slice(0, 15, None)),
         (slice(0, 20, None), slice(7, 13, None), slice(15, 30, None)),
         (slice(0, 20, None), slice(7, 13, None), slice(30, 45, None)),
         (slice(0, 20, None), slice(7, 13, None), slice(45, 64, None))]]],
      dtype=[('0', 'O'), ('1', 'O'), ('2', 'O')])

The indices_or_sections=[[], [7], [15, 30, 45]] parameter indicates that the cut indices for axis=0 are [] (i.e. no cuts), the cut indices for axis=1 are [7] (a single cut at index 7) and the cut indices for axis=2 are [15, 30, 45] (three cuts).
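The shape of the resulting split follows directly from the number of cuts per axis, since an axis with n cuts yields n + 1 sections. This is a small sketch with a hypothetical helper name:

```python
def split_shape_from_cuts(indices_per_axis):
    """Split shape implied by per-axis cut indices: one more section than cuts."""
    return tuple(len(cuts) + 1 for cuts in indices_per_axis)


shape = split_shape_from_cuts([[], [7], [15, 30, 45]])
num_tiles = 1
for n in shape:
    num_tiles *= n
print(shape)      # (1, 2, 4)
print(num_tiles)  # 8
```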

Splitting by tile shape

The tile shape can be explicitly set with the tile_shape parameter, e.g. in 1D:

>>> split = shape_split([20,], tile_shape=[6,])  # Cut into (6,) shaped tiles
>>> split.shape
(4,)
>>> split
array([(slice(0, 6, None),), (slice(6, 12, None),), (slice(12, 18, None),),
       (slice(18, 20, None),)],
      dtype=[('0', 'O')])

and 2D:

>>> split = shape_split([20, 32], tile_shape=[6, 16])  # Cut into (6, 16) shaped tiles
>>> split.shape
(4, 2)
>>> split
array([[(slice(0, 6, None), slice(0, 16, None)),
        (slice(0, 6, None), slice(16, 32, None))],
       [(slice(6, 12, None), slice(0, 16, None)),
        (slice(6, 12, None), slice(16, 32, None))],
       [(slice(12, 18, None), slice(0, 16, None)),
        (slice(12, 18, None), slice(16, 32, None))],
       [(slice(18, 20, None), slice(0, 16, None)),
        (slice(18, 20, None), slice(16, 32, None))]],
      dtype=[('0', 'O'), ('1', 'O')])
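In both examples the array extent is not a multiple of the tile extent along axis 0, so the final tile is truncated (18 to 20). The sketch below reproduces that behaviour for the case shown here, with no halo and no array_start offset. The helper names are hypothetical.

```python
def slices_for_tile_extent(length, tile_extent):
    """Cut [0, length) into pieces of tile_extent elements; the last may be shorter."""
    return [
        slice(beg, min(beg + tile_extent, length))
        for beg in range(0, length, tile_extent)
    ]


def split_shape_for_tile_shape(array_shape, tile_shape):
    """Number of tiles per axis for a given tile shape."""
    return tuple(
        len(slices_for_tile_extent(n, t)) for n, t in zip(array_shape, tile_shape)
    )


axis0 = slices_for_tile_extent(20, 6)
print([(s.start, s.stop) for s in axis0])              # [(0, 6), (6, 12), (12, 18), (18, 20)]
print(split_shape_for_tile_shape([20, 32], [6, 16]))   # (4, 2)
```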

Splitting by maximum bytes per tile

The tile shape can be constrained to a maximum number of bytes per tile by specifying the array_itemsize and max_tile_bytes parameters. In 1D:

>>> split = shape_split(
...   array_shape=[512,],
...   array_itemsize=1,
...   max_tile_bytes=512 # Equals number of array bytes
... )
...
>>> split.shape
(1,)
>>> split
array([(slice(0, 512, None),)],
      dtype=[('0', 'O')])

Double the array per-element number of bytes:

>>> split = shape_split(
...   array_shape=[512,],
...   array_itemsize=2,
...   max_tile_bytes=512 # Equals half the number of array bytes
... )
...
>>> split.shape
(2,)
>>> split
array([(slice(0, 256, None),), (slice(256, 512, None),)],
      dtype=[('0', 'O')])

Decrement max_tile_bytes to 511 to split into 3 tiles:

>>> split = shape_split(
...   array_shape=[512,],
...   array_itemsize=2,
...   max_tile_bytes=511 # Less than half the number of array bytes
... )
...
>>> split.shape
(3,)
>>> split
array([(slice(0, 171, None),), (slice(171, 342, None),),
       (slice(342, 512, None),)],
      dtype=[('0', 'O')])

Note that the split is calculated so that tiles are approximately equal in size.
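The three 1D results above are consistent with choosing the smallest number of tiles for which an even split keeps every tile within the byte limit. The sketch below encodes that reading. It is an interpretation of the outputs rather than the library's algorithm, it ignores halos, and num_tiles_1d is a hypothetical helper.

```python
def num_tiles_1d(length, itemsize, max_tile_bytes):
    """Smallest tile count for which an even split keeps each tile within the limit."""
    max_elements = max_tile_bytes // itemsize  # whole elements that fit in one tile
    return -(-length // max_elements)          # ceiling division


for itemsize, max_bytes in [(1, 512), (2, 512), (2, 511)]:
    n = num_tiles_1d(512, itemsize, max_bytes)
    largest_tile_bytes = -(-512 // n) * itemsize
    print(n, largest_tile_bytes)
# 1 512
# 2 512
# 3 342
```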

In 2D:

>>> split = shape_split(
...   array_shape=[512, 1024],
...   array_itemsize=1,
...   max_tile_bytes=512*512
... )
...
>>> split.shape
(2, 1)
>>> split
array([[(slice(0, 256, None), slice(0, 1024, None))],
       [(slice(256, 512, None), slice(0, 1024, None))]],
      dtype=[('0', 'O'), ('1', 'O')])

and increasing array_itemsize to 4:

>>> split = shape_split(
...   array_shape=[512, 1024],
...   array_itemsize=4,
...   max_tile_bytes=512*512
... )
...
>>> split.shape
(8, 1)
>>> split
array([[(slice(0, 64, None), slice(0, 1024, None))],
       [(slice(64, 128, None), slice(0, 1024, None))],
       [(slice(128, 192, None), slice(0, 1024, None))],
       [(slice(192, 256, None), slice(0, 1024, None))],
       [(slice(256, 320, None), slice(0, 1024, None))],
       [(slice(320, 384, None), slice(0, 1024, None))],
       [(slice(384, 448, None), slice(0, 1024, None))],
       [(slice(448, 512, None), slice(0, 1024, None))]],
      dtype=[('0', 'O'), ('1', 'O')])

The preference is to cut into ('C' order) contiguous memory tiles.
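To see why cutting along axis 0 gives contiguous tiles for a 'C' ordered array, consider the flat memory offsets of the elements of a tile. The sketch below uses a small (4, 6) shape and plain Python. The helper names are hypothetical and a row-major layout with unit strides is assumed.

```python
def flat_offsets(array_shape, row_slice, col_slice):
    """C-order flat offsets of the elements of a 2D tile."""
    num_cols = array_shape[1]
    return [
        row * num_cols + col
        for row in range(row_slice.start, row_slice.stop)
        for col in range(col_slice.start, col_slice.stop)
    ]


def is_contiguous(offsets):
    """True when the offsets form one unbroken run of memory locations."""
    return offsets == list(range(offsets[0], offsets[0] + len(offsets)))


shape = (4, 6)
row_tile = flat_offsets(shape, slice(0, 2), slice(0, 6))  # cut along axis 0 only
col_tile = flat_offsets(shape, slice(0, 4), slice(0, 3))  # cut along axis 1 only
print(is_contiguous(row_tile))  # True
print(is_contiguous(col_tile))  # False
```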

Tile shape upper bound constraint

The split can be influenced by specifying the max_tile_shape parameter. For the previous 2D example, cuts can be forced along axis=1 by constraining the tile shape:

>>> split = shape_split(
...   array_shape=[512, 1024],
...   array_itemsize=4,
...   max_tile_bytes=512*512,
...   max_tile_shape=[numpy.inf, 256]
... )
...
>>> split.shape
(2, 4)
>>> split
array([[(slice(0, 256, None), slice(0, 256, None)),
        (slice(0, 256, None), slice(256, 512, None)),
        (slice(0, 256, None), slice(512, 768, None)),
        (slice(0, 256, None), slice(768, 1024, None))],
       [(slice(256, 512, None), slice(0, 256, None)),
        (slice(256, 512, None), slice(256, 512, None)),
        (slice(256, 512, None), slice(512, 768, None)),
        (slice(256, 512, None), slice(768, 1024, None))]],
      dtype=[('0', 'O'), ('1', 'O')])
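The split shapes of the 2D examples, with and without max_tile_shape, can be reproduced by a greedy scheme that fills a tile along the last axis first and then works towards axis 0, respecting both the byte budget and the per-axis upper bound. The sketch below is an assumption that happens to match these outputs, not the library's implementation. It ignores halos and sub-tile shapes, greedy_split_shape is a hypothetical name, and math.inf stands in for numpy.inf.

```python
import math


def greedy_split_shape(array_shape, itemsize, max_tile_bytes, max_tile_shape=None):
    """Greedy sketch: fill a tile along the last axis first, then move to axis 0."""
    ndim = len(array_shape)
    if max_tile_shape is None:
        max_tile_shape = [math.inf] * ndim
    budget = max_tile_bytes // itemsize  # elements allowed per tile
    sections = [1] * ndim
    for i in reversed(range(ndim)):
        # Largest extent allowed along axis i by the array, the bound and the budget.
        extent = int(max(1, min(array_shape[i], max_tile_shape[i], budget)))
        sections[i] = -(-array_shape[i] // extent)       # ceiling division
        tile_extent = -(-array_shape[i] // sections[i])  # largest tile along axis i
        budget = max(1, budget // tile_extent)           # budget left for other axes
    return tuple(sections)


print(greedy_split_shape([512, 1024], 1, 512 * 512))                   # (2, 1)
print(greedy_split_shape([512, 1024], 4, 512 * 512))                   # (8, 1)
print(greedy_split_shape([512, 1024], 4, 512 * 512, [math.inf, 256]))  # (2, 4)
```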

Sub-tile shape constraint

The split can also be influenced by specifying the sub_tile_shape parameter, which forces the tile shape to be an integer multiple of the sub_tile_shape (tiles at the upper array boundary may be truncated):

>>> split = shape_split(
...   array_shape=[512, 1024],
...   array_itemsize=4,
...   max_tile_bytes=512*512,
...   max_tile_shape=[numpy.inf, 256],
...   sub_tile_shape=(15, 10)
... )
...
>>> split.shape
(3, 5)
>>> split
array([[(slice(0, 180, None), slice(0, 210, None)),
        (slice(0, 180, None), slice(210, 420, None)),
        (slice(0, 180, None), slice(420, 630, None)),
        (slice(0, 180, None), slice(630, 840, None)),
        (slice(0, 180, None), slice(840, 1024, None))],
       [(slice(180, 360, None), slice(0, 210, None)),
        (slice(180, 360, None), slice(210, 420, None)),
        (slice(180, 360, None), slice(420, 630, None)),
        (slice(180, 360, None), slice(630, 840, None)),
        (slice(180, 360, None), slice(840, 1024, None))],
       [(slice(360, 512, None), slice(0, 210, None)),
        (slice(360, 512, None), slice(210, 420, None)),
        (slice(360, 512, None), slice(420, 630, None)),
        (slice(360, 512, None), slice(630, 840, None)),
        (slice(360, 512, None), slice(840, 1024, None))]],
      dtype=[('0', 'O'), ('1', 'O')])
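In the output above the non-final tiles have shape (180, 210), which is 12 and 21 whole sub-tiles of shape (15, 10) per axis, while the final tiles along each axis are truncated at the array boundary. The sketch below checks that property. It also shows that the 210 extent equals ceil(1024 / 5) rounded up to a multiple of 10, which is an observation about this output and not a statement of the library's algorithm. Both helper names are hypothetical.

```python
def is_multiple_of_sub_tile(tile_shape, sub_tile_shape):
    """True when every tile extent is a whole number of sub-tile extents."""
    return all(t % s == 0 for t, s in zip(tile_shape, sub_tile_shape))


def round_up_to_multiple(extent, sub_extent):
    """Smallest multiple of sub_extent that is >= extent."""
    return -(-extent // sub_extent) * sub_extent


sub_tile_shape = (15, 10)
print(is_multiple_of_sub_tile((180, 210), sub_tile_shape))  # True
print([t // s for t, s in zip((180, 210), sub_tile_shape)])  # [12, 21]
print(is_multiple_of_sub_tile((152, 184), sub_tile_shape))  # False (truncated edge tile)
print(round_up_to_multiple(-(-1024 // 5), 10))              # 210
```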

The array_start parameter

The array_start argument to the array_split.shape_split() function and the array_split.ShapeSplitter.__init__() constructor specifies an index offset for the slices in the returned tuple of slice objects:

>>> split = shape_split((15,), 3)
>>> split
array([(slice(0, 5, None),), (slice(5, 10, None),), (slice(10, 15, None),)],
      dtype=[('0', 'O')])
>>> split = shape_split((15,), 3, array_start=(20,))
>>> split
array([(slice(20, 25, None),), (slice(25, 30, None),),
       (slice(30, 35, None),)],
      dtype=[('0', 'O')])
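The effect of array_start in the example above is a uniform shift of every slice. This is a minimal 1D sketch of that shift, using a hypothetical helper:

```python
def offset_slices(slices, start):
    """Shift every slice by the same index offset."""
    return [slice(s.start + start, s.stop + start) for s in slices]


base = [slice(0, 5), slice(5, 10), slice(10, 15)]
shifted = offset_slices(base, 20)
print([(s.start, s.stop) for s in shifted])  # [(20, 25), (25, 30), (30, 35)]
```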

The halo parameter

The halo parameter can be used to generate tiles which overlap with neighbouring tiles by a specified number of array elements (in each axis direction):

>>> from array_split import ARRAY_BOUNDS, NO_BOUNDS
>>> split = shape_split([16,], 4) # No halo
>>> split.shape
(4,)
>>> split
array([(slice(0, 4, None),), (slice(4, 8, None),), (slice(8, 12, None),),
       (slice(12, 16, None),)],
      dtype=[('0', 'O')])
>>> split = shape_split([16,], 4, halo=2, tile_bounds_policy=ARRAY_BOUNDS) # halo width = 2
>>> split.shape
(4,)
>>> split
array([(slice(0, 6, None),), (slice(2, 10, None),), (slice(6, 14, None),),
       (slice(10, 16, None),)],
      dtype=[('0', 'O')])
>>> split = shape_split(
... [16,],
... 4,
... halo=2,
... tile_bounds_policy=NO_BOUNDS  # halo width = 2 and tile halos extend outside array_shape bounds
... )
>>> split.shape
(4,)
>>> split
array([(slice(-2, 6, None),), (slice(2, 10, None),), (slice(6, 14, None),),
       (slice(10, 18, None),)],
      dtype=[('0', 'O')])

The tile_bounds_policy parameter specifies whether the halo extended tiles can extend beyond the bounding box defined by the start index array_start and the stop index array_start + array_shape.
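The two policies can be sketched as a halo extension followed by an optional clip to the array bounds. The helper add_halo is hypothetical. Its array_length argument plays the role of the ARRAY_BOUNDS policy when given and of NO_BOUNDS when omitted, and array_start is assumed to be 0.

```python
def add_halo(slices, halo, array_length=None):
    """Extend each slice by halo elements on both sides.

    When array_length is given the result is clipped to [0, array_length),
    otherwise halos may reach outside the array.
    """
    extended = []
    for s in slices:
        beg, end = s.start - halo, s.stop + halo
        if array_length is not None:
            beg, end = max(beg, 0), min(end, array_length)
        extended.append(slice(beg, end))
    return extended


tiles = [slice(0, 4), slice(4, 8), slice(8, 12), slice(12, 16)]
clipped = add_halo(tiles, 2, array_length=16)
unclipped = add_halo(tiles, 2)
print([(s.start, s.stop) for s in clipped])    # [(0, 6), (2, 10), (6, 14), (10, 16)]
print([(s.start, s.stop) for s in unclipped])  # [(-2, 6), (2, 10), (6, 14), (10, 18)]
```

Note that slices with a negative start cannot be applied directly to a Python sequence or numpy array of the original shape, because negative indices count from the end: list(range(16))[slice(-2, 6)] is empty. Out-of-bounds halos therefore need to be translated or padded before indexing.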

Asymmetric halo extensions can also be specified:

>>> split = shape_split(
... [16,],
... 4,
... halo=((1,2),),
... tile_bounds_policy=NO_BOUNDS
... )
>>> split.shape
(4,)
>>> split
array([(slice(-1, 6, None),), (slice(3, 10, None),), (slice(7, 14, None),),
       (slice(11, 18, None),)],
      dtype=[('0', 'O')])

For an N dimensional split (i.e. N = len(array_shape)), the halo parameter can be one of:

scalar
Tiles are extended by halo elements in the negative and positive directions for all axes.
1D sequence
Tiles are extended by halo[i] elements in the negative and positive directions for axis i.
2D sequence
Tiles are extended by halo[i][0] elements in the negative direction and halo[i][1] in the positive direction for axis i.

For example, in 3D:

>>> split = shape_split(
... [16, 8, 8],
... 2,
... halo=1,  # halo=1 in +ve and -ve directions for all axes
... tile_bounds_policy=NO_BOUNDS
... )
>>> split.shape
(2, 1, 1)
>>> split
array([[[(slice(-1, 9, None), slice(-1, 9, None), slice(-1, 9, None))]],

       [[(slice(7, 17, None), slice(-1, 9, None), slice(-1, 9, None))]]],
      dtype=[('0', 'O'), ('1', 'O'), ('2', 'O')])
>>> split = shape_split(
... [16, 8, 8],
... 2,
... halo=(1, 2, 3),  # halo=1 for axis 0, halo=2 for axis 1, halo=3 for axis=2
... tile_bounds_policy=NO_BOUNDS
... )
>>> split.shape
(2, 1, 1)
>>> split
array([[[(slice(-1, 9, None), slice(-2, 10, None), slice(-3, 11, None))]],

       [[(slice(7, 17, None), slice(-2, 10, None), slice(-3, 11, None))]]],
      dtype=[('0', 'O'), ('1', 'O'), ('2', 'O')])
>>> split = shape_split(
... [16, 8, 8],
... 2,
... halo=((1, 2), (3, 4), (5, 6)),  # halo=1 for -ve axis 0, halo=2 for +ve axis 0
...                                 # halo=3 for -ve axis 1, halo=4 for +ve axis 1
...                                 # halo=5 for -ve axis 2, halo=6 for +ve axis 2
... tile_bounds_policy=NO_BOUNDS
... )
>>> split.shape
(2, 1, 1)
>>> split
array([[[(slice(-1, 10, None), slice(-3, 12, None), slice(-5, 14, None))]],

       [[(slice(7, 18, None), slice(-3, 12, None), slice(-5, 14, None))]]],
      dtype=[('0', 'O'), ('1', 'O'), ('2', 'O')])
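The three accepted forms of halo can all be reduced to one (negative, positive) pair per axis, after which extending a tile is the same operation in every case. The sketch below normalises the halo and applies it to the first tile of the [16, 8, 8] split, without any bounds clipping (as for NO_BOUNDS). The helpers halo_pairs and extend_tile are hypothetical and are not the library's convert_halo_to_array_form function.

```python
def halo_pairs(halo, ndim):
    """Normalise a scalar, 1D or 2D halo to a list of (negative, positive) pairs."""
    if isinstance(halo, int):
        return [(halo, halo)] * ndim
    pairs = [(h, h) if isinstance(h, int) else (h[0], h[1]) for h in halo]
    if len(pairs) != ndim:
        raise ValueError("halo must provide one entry per axis")
    return pairs


def extend_tile(slice_tuple, halo):
    """Extend a tile by its halo on every axis, without clipping to array bounds."""
    pairs = halo_pairs(halo, len(slice_tuple))
    return tuple(
        slice(s.start - lo, s.stop + hi) for s, (lo, hi) in zip(slice_tuple, pairs)
    )


tile = (slice(0, 8), slice(0, 8), slice(0, 8))  # first tile of the [16, 8, 8] split
for halo in (1, (1, 2, 3), ((1, 2), (3, 4), (5, 6))):
    print([(s.start, s.stop) for s in extend_tile(tile, halo)])
# [(-1, 9), (-1, 9), (-1, 9)]
# [(-1, 9), (-2, 10), (-3, 11)]
# [(-1, 10), (-3, 12), (-5, 14)]
```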

The array_split Package

Small python package for splitting a numpy.ndarray (or just an array shape) into a number of sub-arrays.

The two main functions are:

array_split.array_split()
Similar to numpy.array_split(), returns a list of views of sub-arrays of the input numpy.ndarray. Can split along multiple axes and has more splitting criteria (parameters) than numpy.array_split().
array_split.shape_split()
Instead of taking a numpy.ndarray as an argument, it takes the array shape and returns tuples of slice objects which indicate the extents of the sub-arrays.

These two functions use an instance of the array_split.ShapeSplitter class which contains the bulk of the split implementation and maintains some state related to the computed split.

Splitting of multi-dimensional arrays can be performed according to several criteria:

  • Per-axis indices indicating the cut positions.

  • Per-axis number of sub-arrays.

  • Total number of sub-arrays (with optional per-axis number of sections constraints).

  • Specific sub-array shape.

  • Maximum number of bytes for a sub-array with constraints:

    • sub-arrays are an integer multiple of a specified sub-tile shape
    • upper limit on the per-axis sub-array shape

The usage documentation is given in the Examples section.

Classes and Functions

shape_split(array_shape, *args, **kwargs)
Splits the specified array_shape into tiles, returns array of slice tuples.
array_split(ary[, indices_or_sections, ...])
Splits the specified array ary into sub-arrays, returns list of numpy.ndarray.
ShapeSplitter(array_shape[, ...])
Implements array shape splitting.

Attributes

array_split.ARRAY_BOUNDS = <property object>

See array_split.split.ARRAY_BOUNDS

array_split.NO_BOUNDS = <property object>

See array_split.split.NO_BOUNDS

The array_split.split Module

Defines array splitting functions and classes.

Classes and Functions

shape_factors(n[, dim])
Returns a numpy.ndarray of factors f such that (len(f) == dim) and (numpy.prod(f) == n).
calculate_num_slices_per_axis(...[, ...])
Returns a numpy.ndarray (return_array say) where non-positive elements of
calculate_tile_shape_for_max_bytes(...[, ...])
Returns a tile shape tile_shape such that numpy.prod(tile_shape)*numpy.sum(array_itemsize) <= max_tile_bytes.
convert_halo_to_array_form(halo, ndim)
Converts the halo argument to a (ndim, 2) shaped array.
ShapeSplitter(array_shape[, ...])
Implements array shape splitting.
shape_split(array_shape, *args, **kwargs)
Splits the specified array_shape into tiles, returns array of slice tuples.
array_split(ary[, indices_or_sections, ...])
Splits the specified array ary into sub-arrays, returns list of numpy.ndarray.

Attributes

array_split.split.ARRAY_BOUNDS = <property object>

Indicates that tiles are always within the array bounds, resulting in tiles which have truncated halos. See The halo parameter examples.

array_split.split.NO_BOUNDS = <property object>

Indicates that tiles may have halos which extend beyond the array bounds. See The halo parameter examples.

The array_split.split_test Module

Module defining array_split.split unit-tests. Execute as:

python -m array_split.split_test

Classes

SplitTest([methodName])
unittest.TestCase for array_split.split functions.

The array_split.tests Module

Module for running all array_split unit-tests, including unittest test-cases and doctest tests for module doc-strings and sphinx (RST) documentation. Execute as:

python -m array_split.tests

The array_split.logging Module

Default initialisation of python logging.

Some simple wrappers of python built-in logging module for array_split logging.

Classes and Functions

SplitStreamHandler([outstr, errstr, splitlevel])
A python logging.handlers Handler class for splitting logging messages to different streams depending on the logging-level.
initialise_loggers(names[, log_level, ...])
Initialises specified loggers to generate output at the specified logging level.
get_formatter([prefix_string])
Returns logging.Formatter object which produces messages with time and prefix_string prefix.
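The idea of splitting log messages between streams by level can be sketched with the standard logging module alone. The class below is a hypothetical illustration and is not the SplitStreamHandler implementation. It assumes that records below the split level go to one stream and all other records go to a second stream.

```python
import io
import logging


class LevelSplitHandler(logging.Handler):
    """Hypothetical handler: records below split_level go to out, the rest to err."""

    def __init__(self, out, err, split_level=logging.WARNING):
        super().__init__()
        self.out = out
        self.err = err
        self.split_level = split_level

    def emit(self, record):
        # Choose the destination stream from the record's level.
        stream = self.err if record.levelno >= self.split_level else self.out
        stream.write(self.format(record) + "\n")


out, err = io.StringIO(), io.StringIO()
logger = logging.getLogger("level_split_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demonstration isolated from the root logger
logger.addHandler(LevelSplitHandler(out, err))
logger.info("tile computed")
logger.error("tile too large")
print(repr(out.getvalue()))  # 'tile computed\n'
print(repr(err.getvalue()))  # 'tile too large\n'
```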

The array_split.unittest Module

Some simple wrappers of python built-in unittest module for array_split unit-tests.

Classes and Functions

main(module_name[, log_level, init_logger_names])
Small wrapper for unittest.main() which initialises logging.Logger objects.
TestCase([methodName])
Extends unittest.TestCase with the assertArraySplitEqual() method.

The array_split.license Module

License and copyright info.

License

Copyright (C) 2017 The Australian National University.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Functions

license()
Returns the array_split license string.
copyright()
Returns the array_split copyright string.