Installing h5py with Parallel HDF5
Last updated: April 1, 2021
This guide describes how to install h5py with parallel (MPI) I/O features. A parallel h5py build also requires an mpi4py installation linked against the system MPI.
Installation:
- Load the latest parallel HDF5 module (e.g. HDF5/1.10.6-CrayGNU-20.11-parallel):
  module load HDF5/1.10.6-CrayGNU-20.11-parallel
- Update the environment variables:
  export MPI_DIR=$MPICH_DIR
  export MPI_INCLUDE=$MPICH_DIR/include
  export MPI_LIB=$MPICH_DIR/lib
  export LD_LIBRARY_PATH=$MPI_DIR/lib:$LD_LIBRARY_PATH
  export MPICC=CC
  export mpicc=CC
  export HDF5_MPI="ON"
  export HDF5_DIR=/apps/daint/UES/jenkins/7.0.UP02-20.11/gpu/easybuild/software/HDF5/1.10.6-CrayGNU-20.11-parallel
- Install h5py from source.
  a. Clone the repository locally:
     git clone https://github.com/h5py/h5py
  b. Update the setup_build.py file with the following additional include_dirs entry:
     settings['include_dirs'] += ['/opt/cray/pe/mpt/7.7.16/gni/mpich-gnu/8.2/include/']
  c. From within the cloned h5py directory, install using pip:
     pip install .
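Once the install finishes, it can be worth confirming that the resulting h5py build really has MPI support before submitting a job. The following is a minimal sketch, assuming the `mpi` flag on the object returned by `h5py.get_config()` reflects how the package was built. The helper name `describe_build` is hypothetical and used only for illustration.

```python
def describe_build(config):
    """Return a short label for an h5py build configuration object.

    `config` is expected to expose a boolean `mpi` attribute, as the
    object returned by h5py.get_config() does. A missing attribute is
    treated as a serial build.
    """
    return "parallel (MPI)" if getattr(config, "mpi", False) else "serial"


def main():
    # Imported here so describe_build can be used without h5py installed.
    import h5py

    print("h5py", h5py.version.version, "build:", describe_build(h5py.get_config()))
    print("HDF5 library version:", h5py.version.hdf5_version)


if __name__ == "__main__":
    main()
```

If this reports a serial build, the likely cause is that HDF5_MPI or HDF5_DIR was not set in the shell that ran the pip install step.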
Testing:
Test the parallel h5py build using the following example script (parallel_h5py.py):
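from mpi4py import MPI
import h5py

comm = MPI.COMM_WORLD
rank = comm.rank  # index of this process
size = comm.size  # total number of processes

# Open the file collectively with the MPI-IO driver.
f = h5py.File('parallel_test.hdf5', 'w', driver='mpio', comm=comm)
# Dataset creation is collective: every rank must make the same call.
dset = f.create_dataset('test', (size,), dtype='i')
# Each rank writes its own rank number into its own element.
dset[rank] = rank
f.close()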
- Load the system HDF5/xxx-parallel module and your custom Python environment with h5py and mpi4py, then run the parallel_h5py.py script:
  srun python parallel_h5py.py
- Inspect the output:
  h5dump parallel_test.hdf5
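As an alternative to h5dump, the file can be read back in Python. The sketch below assumes the test script ran successfully and wrote parallel_test.hdf5 with a dataset named test. The helper `ranks_match` is hypothetical. Reading here uses the default (serial) driver, so a single process should be enough.

```python
def ranks_match(values):
    """Return True if element i of `values` equals i for every index."""
    return all(int(v) == i for i, v in enumerate(values))


def main(path="parallel_test.hdf5"):
    # Imported here so ranks_match can be used without h5py installed.
    import h5py

    # Open read-only with the default (serial) driver.
    with h5py.File(path, "r") as f:
        values = f["test"][...]
    print("values:", list(values))
    print("OK" if ranks_match(values) else "MISMATCH")


if __name__ == "__main__":
    main()
```

With N MPI ranks in the srun job, the dataset should hold the integers 0 through N-1 in order, since each rank writes its own rank number at its own index.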
Last update: March 9, 2022