The array_split Python package is an enhancement to existing numpy.ndarray functions, such as numpy.array_split, skimage.util.view_as_blocks and skimage.util.view_as_windows, which sub-divide a multi-dimensional array into a number of multi-dimensional sub-arrays (slices). Example application areas include:
- Parallel Processing
  - A large (dense) array is partitioned into smaller sub-arrays which can be processed concurrently by multiple processes (multiprocessing or mpi4py) or by other memory-limited hardware (e.g. GPGPU using pyopencl, pycuda, etc.). For GPGPU, a sub-array must not exceed the GPU memory, and it is desirable for the sub-array shape to be a multiple of the work-group (OpenCL) or thread-block (CUDA) size.
- File I/O
  - A large (dense) array is partitioned into smaller sub-arrays which can be written to individual files (as, for example, an HDF5 Virtual Dataset). It is often desirable for the individual files not to exceed a specified number of (giga)bytes and, for HDF5, it is desirable for the sub-array shape in each file to be a multiple of the chunk shape. Similarly, out-of-core algorithms for large dense arrays often process the entire data-set as a series of in-core sub-arrays. Again, it is desirable for each sub-array shape to be a multiple of the chunk shape.
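As a rough illustration of the parallel-processing case above, the following sketch partitions a small array with numpy.array_split and reduces each sub-array in a separate worker. It uses a standard-library thread pool only to stay self-contained; `partial_sum` is a hypothetical per-sub-array task, and a process pool (multiprocessing) or mpi4py could be substituted for real workloads.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def partial_sum(sub_array):
    """Hypothetical per-sub-array task: reduce one sub-array to a scalar."""
    return int(sub_array.sum())


ary = np.arange(0, 4 * 9).reshape(4, 9)

# Partition along axis 1 into three sub-arrays of shape (4, 3).
sub_arrays = np.array_split(ary, 3, axis=1)

# Process the sub-arrays concurrently, one task per sub-array.
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_sums = list(pool.map(partial_sum, sub_arrays))

# Combining the partial results gives the same answer as the whole array.
total = sum(partial_sums)
```

Because each task only sees its own sub-array, the same pattern applies when the full array is too large for a single worker's memory.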
The array_split package provides the means to partition an array (or array shape) using any of the following criteria:
- Per-axis indices indicating the cut positions.
- Per-axis number of sub-arrays.
- Total number of sub-arrays (with optional per-axis number of sections constraints).
- Specific sub-array shape.
- Specification of halo (ghost) elements for sub-arrays.
- Arbitrary start index for the shape to be partitioned.
- Maximum number of bytes for a sub-array with constraints:
  - sub-arrays are an even multiple of a specified sub-tile shape
  - upper limit on the per-axis sub-array shape
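To make the byte-limit, sub-tile and halo criteria more concrete, here is a minimal pure-Python sketch of the underlying arithmetic. It is not the array_split implementation or API: `tile_shape_for_max_bytes` and `tile_slices` are hypothetical helpers written for this illustration, and the shrinking order (axis 0 first) is an assumption.

```python
import itertools
import math


def tile_shape_for_max_bytes(array_shape, itemsize, max_tile_bytes, sub_tile_shape):
    """Hypothetical helper: shrink a tile (axis 0 first) until it fits in
    max_tile_bytes, keeping every extent a multiple of the sub-tile extent."""
    # Start with a single tile covering the array, rounded up to sub-tile multiples.
    tile = [math.ceil(n / s) * s for n, s in zip(array_shape, sub_tile_shape)]
    for axis, sub in enumerate(sub_tile_shape):
        while math.prod(tile) * itemsize > max_tile_bytes and tile[axis] > sub:
            tile[axis] -= sub
    if math.prod(tile) * itemsize > max_tile_bytes:
        raise ValueError("a single sub-tile already exceeds max_tile_bytes")
    return tuple(tile)


def tile_slices(array_shape, tile_shape, halo=0):
    """Hypothetical helper: one tuple of slices per tile, each grown by
    `halo` elements per side and clipped to the array bounds."""
    per_axis = []
    for n, t in zip(array_shape, tile_shape):
        per_axis.append(
            [slice(max(start - halo, 0), min(start + t + halo, n))
             for start in range(0, n, t)]
        )
    return list(itertools.product(*per_axis))


# A (512, 1024) array of 8-byte elements, at most 1 MiB per tile,
# with tile extents restricted to multiples of a (32, 64) sub-tile.
shape = (512, 1024)
tile = tile_shape_for_max_bytes(
    shape, itemsize=8, max_tile_bytes=1024 ** 2, sub_tile_shape=(32, 64)
)
# In this sketch the halo is added after sizing, so haloed tiles
# can slightly exceed the byte limit.
slices = tile_slices(shape, tile, halo=1)
```

Here the tile shape comes out as (128, 1024), giving four tiles along axis 0; the halo makes neighbouring tiles overlap by one element on each side, except at the array boundary.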
Quick Start Example
>>> from array_split import array_split, shape_split
>>> import numpy as np
>>>
>>> ary = np.arange(0, 4*9)
>>>
>>> array_split(ary, 4) # 1D split into 4 sections (like numpy.array_split)
[array([0, 1, 2, 3, 4, 5, 6, 7, 8]),
 array([ 9, 10, 11, 12, 13, 14, 15, 16, 17]),
 array([18, 19, 20, 21, 22, 23, 24, 25, 26]),
 array([27, 28, 29, 30, 31, 32, 33, 34, 35])]
>>>
>>> shape_split(ary.shape, 4) # 1D split into 4 parts, returns slice objects
array([(slice(0, 9, None),), (slice(9, 18, None),), (slice(18, 27, None),), (slice(27, 36, None),)],
      dtype=[('0', 'O')])
>>>
>>> ary = ary.reshape(4, 9) # Make ary 2D
>>> split = shape_split(ary.shape, axis=(2, 3)) # 2D split into 2*3=6 sections
>>> split.shape
(2, 3)
>>> split
array([[(slice(0, 2, None), slice(0, 3, None)),
        (slice(0, 2, None), slice(3, 6, None)),
        (slice(0, 2, None), slice(6, 9, None))],
       [(slice(2, 4, None), slice(0, 3, None)),
        (slice(2, 4, None), slice(3, 6, None)),
        (slice(2, 4, None), slice(6, 9, None))]],
      dtype=[('0', 'O'), ('1', 'O')])
>>> sub_arys = [ary[tup] for tup in split.flatten()] # Create sub-array views from slice tuples.
>>> sub_arys
[array([[ 0,  1,  2],
       [ 9, 10, 11]]), array([[ 3,  4,  5],
       [12, 13, 14]]), array([[ 6,  7,  8],
       [15, 16, 17]]), array([[18, 19, 20],
       [27, 28, 29]]), array([[21, 22, 23],
       [30, 31, 32]]), array([[24, 25, 26],
       [33, 34, 35]])]
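The sub-arrays created from slice tuples in the example above are NumPy views, not copies. The short sketch below uses plain NumPy only, with one slice tuple written by hand to match a tuple shown in the 2D split output. It illustrates that writing through such a view modifies the parent array.

```python
import numpy as np

ary = np.arange(0, 4 * 9).reshape(4, 9)

# Hand-written slice tuple equal to one element of the 2D split above.
tup = (slice(0, 2), slice(3, 6))
sub = ary[tup]

# Basic slicing returns a view that shares memory with the parent array.
shares = np.shares_memory(ary, sub)

# Writing to the view changes the corresponding region of the parent.
sub[...] = -1
```

Use `ary[tup].copy()` instead when a sub-array must be modified independently of the original.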
The latest Sphinx documentation (including more examples) is at http://array-split.readthedocs.io/en/latest/.
Install using pip (root access required):
pip install array_split
or local user install (no root access required):
pip install --user array_split
or local user install from latest github source:
pip install --user git+https://github.com/array-split/array_split.git#egg=array_split
Run tests (unit-tests and doctest module docstring tests) using:
python -m array_split.tests
or, from the source tree, run:
python setup.py test
Travis CI at:
and AppVeyor at:
Latest sphinx generated documentation is at:
and at github gh-pages:
Sphinx documentation can be built from the source:
python setup.py build_sphinx
with the HTML generated in
To search for bugs or report them, please use the bug tracker at: