What’s new in h5py 3.8
New features
- h5py now has pre-built packages for Python 3.11.
- h5py is compatible with HDF5 1.14 (PR 2187). Pre-built packages on PyPI still include HDF5 1.12 for now.
- Fancy indexing now accepts tuples, or any other sequence type, rather than only lists and NumPy arrays. This also includes range objects, although these will normally be less efficient than the equivalent slice (example below).
- New property Dataset.is_scale for checking whether a dataset is a dimension scale (PR 2168); see the example below.
- Group.require_dataset() now validates maxshape for resizable datasets (PR 2116).
- File now has a meta_block_size argument and property. This influences how the space for metadata, including the initial header, is allocated; see the example below.
- The chunk cache can be configured for each individual HDF5 dataset (PR 2127). Use Group.create_dataset() for new datasets or Group.require_dataset() for existing datasets. Any combination of the rdcc_nbytes, rdcc_w0 and rdcc_nslots arguments is allowed; the file defaults apply to those omitted (example below).
- HDF5 file names for the ros3 driver can now also be s3:// resource locations (PR 2140). h5py will translate them into AWS path-style URLs for use by the driver (example below).
- When using the ros3 driver, AWS authentication is activated only if all three driver arguments are provided. Previously, AWS authentication was active if any one of the arguments was set, causing an error from the HDF5 library.
- Dataset.fields() now implements the __array__() method (PR 2151). This speeds up accessing fields with functions that expect this, like np.asarray() (example below).
- New low-level method h5py.h5d.DatasetID.chunk_iter() invokes a user-supplied callable object on every written chunk of one dataset (PR 2202). It provides much better performance when iterating over a large number of chunks (example below).
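A minimal sketch of the broadened fancy indexing, assuming a throwaway file created on the spot:

    import numpy as np
    import h5py

    # "demo.h5" is an illustrative file name.
    with h5py.File("demo.h5", "w") as f:
        dset = f.create_dataset("data", data=np.arange(20).reshape(4, 5))

        # A fancy index may now be a tuple (previously only lists and
        # NumPy arrays were accepted).
        print(dset[:, (0, 2, 4)])

        # A range object also works, but the equivalent slice is
        # normally more efficient.
        print(dset[:, range(0, 5, 2)])
        print(dset[:, 0:5:2])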
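A sketch of the new Dataset.is_scale property; make_scale() and attach_scale() are existing h5py dimension-scale calls, and the file and dataset names are made up:

    import h5py

    # File and dataset names are illustrative.
    with h5py.File("scales.h5", "w") as f:
        time = f.create_dataset("time", data=[0.0, 0.5, 1.0])
        data = f.create_dataset("data", data=[1.0, 2.0, 3.0])

        time.make_scale("time")          # mark 'time' as a dimension scale
        data.dims[0].attach_scale(time)  # attach it to the first axis of 'data'

        print(time.is_scale)  # True
        print(data.is_scale)  # False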
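A sketch of the new meta_block_size argument and matching File property; the 2 MiB value is arbitrary:

    import h5py

    # Reserve metadata space in 2 MiB blocks (illustrative value).
    with h5py.File("meta_demo.h5", "w", meta_block_size=2 * 1024 * 1024) as f:
        print(f.meta_block_size)  # 2097152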
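A sketch of configuring the chunk cache for a single dataset; the shape, chunk size and cache numbers are only illustrative:

    import h5py

    with h5py.File("cache_demo.h5", "w") as f:
        # The cache settings apply to this dataset only; file-level
        # defaults are used for any rdcc_* argument that is omitted.
        dset = f.create_dataset(
            "big",
            shape=(10_000, 10_000),
            chunks=(1_000, 1_000),
            dtype="f4",
            rdcc_nbytes=64 * 1024 ** 2,  # 64 MiB chunk cache for this dataset
            rdcc_nslots=1_000_003,       # hash table slots; a prime works well
            rdcc_w0=0.75,                # chunk preemption policy weight
        )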
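A sketch of opening an s3:// location with the ros3 driver. The bucket, object key and credentials are invented, and the aws_region, secret_id and secret_key argument names refer to h5py's existing ros3 driver options; this also assumes h5py was built with read-only S3 support:

    import h5py

    # Bucket, key and credentials are placeholders. AWS authentication
    # is activated only when all three of aws_region, secret_id and
    # secret_key are supplied; omit them all for anonymous access.
    with h5py.File(
        "s3://example-bucket/path/to/data.h5",  # translated to a path-style URL
        "r",
        driver="ros3",
        aws_region=b"eu-west-1",
        secret_id=b"<access-key-id>",
        secret_key=b"<secret-access-key>",
    ) as f:
        print(list(f))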
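A sketch of reading one field of a compound dataset through np.asarray(), which now goes through the new __array__() support on Dataset.fields(); the compound dtype and names are illustrative:

    import numpy as np
    import h5py

    dt = np.dtype([("x", "f8"), ("y", "f8")])

    with h5py.File("fields_demo.h5", "w") as f:
        dset = f.create_dataset("points", shape=(1000,), dtype=dt)

        # np.asarray() can now consume the fields view directly; only
        # the 'x' field is read, not the whole compound array.
        x = np.asarray(dset.fields("x"))
        print(x.dtype, x.shape)  # float64 (1000,)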
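A sketch of the low-level chunk iteration. The file name and dataset path are invented, and the callback is assumed to receive one h5py.h5d.StoreInfo per written chunk (the same kind of object returned by get_chunk_info()); treat the exact callback signature as an assumption:

    import h5py

    def report(store_info):
        # Describes one written chunk of the dataset.
        print(store_info.chunk_offset, store_info.byte_offset, store_info.size)

    # File name and dataset path are illustrative.
    with h5py.File("chunked.h5", "r") as f:
        dset = f["data"]
        # The callable is invoked once per written chunk; a single pass
        # like this is much faster than querying chunks one at a time.
        dset.id.chunk_iter(report)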
Exposing HDF5 functions
- H5Dchunk_iter as h5py.h5d.DatasetID.chunk_iter().
- H5Pset_meta_block_size and H5Pget_meta_block_size (PR 2106).
Bug fixes
- Fixed getting the default fill value (an empty string) for variable-length string data (PR 2132).
- Complex float16 data could cause a TypeError when trying to coerce to the currently unavailable numpy.dtype('c4'). Now a compound type is used instead (PR 2157).
- h5py 3.7 contained a performance regression when using a boolean mask array to index a 1D dataset; this is now fixed (PR 2193).
Building h5py
- Parallel HDF5 can be built with Microsoft MS-MPI (PR 2147). See Building against Parallel HDF5 for details.
- Some ‘incompatible function pointer type’ compile-time warnings were fixed (PR 2142).
- Fixed finding the HDF5 DLL when building with MinGW (PR 2105).