This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Reto,

> I've tried the following configuration
> - hdf5 1.8.11-snap16
> - netcdf-4.3.0-rc4
> - netcdf-fortran-4.2
> - openmpi-1.6.3
> - gcc/gfortran 4.6.3
>
> Same issue. If I let all processes do the write, then it works fine. If I for
> instance exclude process #0,1,2 or 3 from the writing, then the write hangs
> (all metadata/open/close is collective, only the write is independent.). It
> seems to me that somehow on my system all writes are collective by default
> and thus the write operation is not executed as independent.
>
> Do you have a configuration with openmpi on OSX somewhere around?

Yes, I had to deactivate my mpich configuration first, but now have
openmpi 1.6.4 on OSX 10.8.3. However, when I try to build hdf5 1.8.11-pre1
with it, using

  CC=/opt/local/lib/openmpi/bin/mpicc ./configure
  make
  make check

some tests fail in "make check", for example testing
"ph5diff h5diff_basiccl.h5". That may be due to not having a POSIX-compliant
parallel file system installed. Also, I just noticed that the earlier
t_posix_compliant test for allwrite_allread_blocks with POSIX IO failed,
though it returned 0 so as not to stop the hdf5 testing.

Are you using a parallel file system? Do you set the environment variable
HDF5_PARAPREFIX to a directory in a parallel file system? What file system
are you using for your parallel I/O tests?

I'm afraid I don't know much about parallel I/O, and the netCDF parallel I/O
expert got lured away to a different job some time ago, so we may need some
help or pointers on where to look to install a parallel file system on our
OS X platform for this kind of testing and debugging.

> I will start putting some debugging commands into the netcdf-fortran library
> and see where the process really hangs and whether the collective/independent
> write is executed correctly.

Thanks, that would be helpful ...

--Russ

> Reto
>
>
> On Apr 9, 2013, at 11:01 PM, Unidata netCDF Support wrote:
>
> > Hi Reto,
> >
> > Sorry to have taken so long to respond to your question.
> >
> >> I have been using NetCDF-4 Parallel I/O with the Fortran 90 interface for
> >> some time with success. Thank you for this great tool!
> >>
> >> However, I now have an issue with independent access:
> >>
> >> - NetCDF F90 Parallel access (NetCDF-4, MPIIO)
> >> - 3 fixed and 1 unlimited dimension
> >> - all processes open/close the file and write metadata
> >> - only a few processes write to the file (-> independent access)
> >> - the write hangs. It works fine if all processes take part.
> >>
> >> I've changed your example F90 parallel I/O file simple_xy_par_wr.f90 to
> >> include an unlimited dimension and independent access of only a subset of
> >> processes. Same issue. Even if I explicitly set the access type to
> >> independent for the variable. Can you reproduce the issue on your side?
> >>
> >> The following system configuration on my side:
> >> - NetCDF 4.2.1.1 and F90 interface 4.2
> >> - hdf5 1.8.9
> >> - Openmpi 1.
> >> - OSX, gcc 4.6.3
> >
> > No, I haven't been able to reproduce the issue, but I can't exactly
> > duplicate your configuration easily, and there have been some updates and
> > bug fixes that may have made a difference.
> >
> > First I tried this configuration, which worked fine on your attached
> > example:
> >
> > - NetCDF 4.3.0-rc4 and F90 interface 4.2
> > - hdf5 1.8.11 (release candidate from svn repository)
> > - mpich2-1.3.1
> > - Linux Fedora, mpicc, mpif90 wrapping gcc, gfortran 4.5.1
> >
> > So if you can build those versions, it should work for you. I'm not sure
> > whether the fix is in netCDF-4.3.0 or in hdf5-1.8.11, but both have a fix
> > for at least one parallel I/O hanging process issue:
> >
> > https://bugtracking.unidata.ucar.edu/browse/NCF-214 (fix in netCDF-4.3.0)
> > https://bugtracking.unidata.ucar.edu/browse/NCF-240 (fix in HDF5-1.8.11)
> >
> > --Russ
> >
> > Russ Rew UCAR Unidata Program
> > address@hidden http://www.unidata.ucar.edu
> >
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: TIR-820282
> > Department: Support netCDF
> > Priority: High
> > Status: Closed
> >
>

Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu

Ticket Details
===================
Ticket ID: TIR-820282
Department: Support netCDF
Priority: High
Status: Closed
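
The reply above asks whether HDF5_PARAPREFIX is set to a directory on a parallel file system before the HDF5 parallel tests are run. A minimal sketch of a pre-flight check is shown below. The function name check_paraprefix and the example path are hypothetical and not part of HDF5 or netCDF. The check only confirms that the variable names an existing, writable directory; it cannot tell whether that directory actually lives on a parallel file system.

```shell
#!/bin/sh
# Sketch: sanity-check HDF5_PARAPREFIX before running the HDF5 parallel tests.
# check_paraprefix is a hypothetical helper, not an HDF5 or netCDF tool.

# Print a one-word status and return 0 only if HDF5_PARAPREFIX names an
# existing, writable directory. This does NOT verify that the directory is
# on a parallel file system.
check_paraprefix() {
    dir="${HDF5_PARAPREFIX:-}"
    if [ -z "$dir" ]; then
        echo "unset"
        return 1
    fi
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "unusable"
        return 1
    fi
    echo "ok"
}

# Example use (assumption: run from the top of an HDF5 source tree already
# configured with the MPI compiler wrapper, as in the message above):
#
#   export HDF5_PARAPREFIX=/path/to/parallel/scratch   # hypothetical path
#   check_paraprefix && make check
```

If the function prints "unset" or "unusable", the answers to the questions in the reply are likely "no", which would be worth reporting alongside any failing tests.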