This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Reto,

> Yes, the POSIX parallel I/O tests fail on OSX with OpenMPI, but that is fine.
> OSX and OpenMPI use MPIIO. So to my understanding the parallel tests are ok
> if either POSIX or MPIIO work and the other one fails.
>
> I am actually not using a parallel file system on OSX. I use the regular file
> system (basic OSX installation) and I think that the parallel I/O has to work
> in collective and independent mode even when using a regular file system.

I'm curious how you installed parallel HDF5, because my "make check" fails
before finishing the tests. Did you build HDF5 without --enable-parallel, or
without using CC=mpicc? Or did you build it with parallel I/O, but run
"make install" even though "make check" failed as a result of not having a
parallel file system?

--Russ

> I will test the same installation on Linux and then start debugging on OSX,
> and maybe we find out something.
>
> Btw. the netcdf-fortran 4.4 beta failed to compile altogether on OSX, so I'm
> still using netcdf-fortran 4.2.
>
> Have a great weekend,
>
> Reto
>
>
> On Apr 12, 2013, at 5:59 PM, Unidata netCDF Support wrote:
>
> > Reto,
> >
> >> I've tried the following configuration
> >> - hdf5 1.8.11-snap16
> >> - netcdf-4.3.0-rc4
> >> - netcdf-fortran-4.2
> >> - openmpi-1.6.3
> >> - gcc/gfortran 4.6.3
> >>
> >> Same issue. If I let all processes do the write, then it works fine. If I
> >> for instance exclude process #0,1,2 or 3 from the writing, then the write
> >> hangs (all metadata/open/close is collective, only the write is
> >> independent.). It seems to me that somehow on my system all writes are
> >> collective by default and thus the write operation is not executed as
> >> independent.
> >>
> >> Do you have a configuration with openmpi on OSX somewhere around?
> >
> > Yes, I had to deactivate my mpich configuration first, but now have openmpi
> > 1.6.4 on OSX 10.8.3.
> > However, when I try to build hdf5 1.8.11-pre1 with it, using
> >
> >   CC=/opt/local/lib/openmpi/bin/mpicc ./configure
> >   make
> >   make check
> >
> > some tests fail in "make check", for example testing "ph5diff
> > h5diff_basiccl.h5", which may be due to not having a POSIX-compliant
> > parallel file system installed. Also, I just noticed that the earlier
> > t_posix_compliant test for allwrite_allread_blocks with POSIX IO failed,
> > though it returned 0 so as not to stop the hdf5 testing.
> >
> >
> > Are you using a parallel file system? Do you set the environment variable
> > HDF5_PARAPREFIX to a directory in a parallel file system? What file system
> > are you using for your parallel I/O tests?
> >
> > I'm afraid I don't know much about parallel I/O, and the netCDF parallel
> > I/O expert got lured away to a different job some time ago, so we may need
> > some help or pointers where to look to install a parallel file system on
> > our OS X platform for this kind of testing and debugging.
> >
> >> I will start putting some debugging commands into the netcdf-fortran
> >> library and see where the process really hangs and whether the
> >> collective/independent write is executed correctly.
> >
> > Thanks, that would be helpful ...
> >
> > --Russ
> >
> >> Reto
> >>
> >>
> >> On Apr 9, 2013, at 11:01 PM, Unidata netCDF Support wrote:
> >>
> >>> Hi Reto,
> >>>
> >>> Sorry to have taken so long to respond to your question.
> >>>> I have been using NetCDF-4 Parallel I/O with the Fortran 90 interface
> >>>> for some time with success. Thank you for this great tool!
> >>>>
> >>>> However, I now have an issue with independent access:
> >>>>
> >>>> - NetCDF F90 Parallel access (NetCDF-4, MPIIO)
> >>>> - 3 fixed and 1 unlimited dimension
> >>>> - all processes open/close the file and write metadata
> >>>> - only a few processes write to the file (-> independent access)
> >>>> - the write hangs. It works fine if all processes take part.
> >>>>
> >>>> I've changed your example F90 parallel I/O file simple_xy_par_wr.f90 to
> >>>> include an unlimited dimension and independent access of only a subset of
> >>>> processes. Same issue, even if I explicitly set the access type to
> >>>> independent for the variable. Can you reproduce the issue on your side?
> >>>>
> >>>> The following system configuration on my side:
> >>>> - NetCDF 4.2.1.1 and F90 interface 4.2
> >>>> - hdf5 1.8.9
> >>>> - Openmpi 1.
> >>>> - OSX, gcc 4.6.3
> >>>
> >>> No, I haven't been able to reproduce the issue, but I can't exactly
> >>> duplicate your configuration easily, and there have been some updates and
> >>> bug fixes that may have made a difference.
> >>>
> >>> First I tried this configuration, which worked fine on your attached
> >>> example:
> >>>
> >>> - NetCDF 4.3.0-rc4 and F90 interface 4.2
> >>> - hdf5 1.8.11 (release candidate from svn repository)
> >>> - mpich2-1.3.1
> >>> - Linux Fedora, mpicc, mpif90 wrapping gcc, gfortran 4.5.1
> >>>
> >>> So if you can build those versions, it should work for you.
> >>> I'm not sure whether the fix is in netCDF-4.3.0 or in hdf5-1.8.11, but
> >>> both have a fix for at least one parallel I/O hanging process issue:
> >>>
> >>> https://bugtracking.unidata.ucar.edu/browse/NCF-214 (fix in netCDF-4.3.0)
> >>> https://bugtracking.unidata.ucar.edu/browse/NCF-240 (fix in HDF5-1.8.11)
> >>>
> >>> --Russ

Russ Rew    UCAR Unidata Program
address@hidden    http://www.unidata.ucar.edu

Ticket Details
===================
Ticket ID: TIR-820282
Department: Support netCDF
Priority: High
Status: Closed
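
One way to answer the question raised in this thread, whether an installed HDF5 was actually built with parallel I/O, is to inspect the build summary that HDF5 writes at configure time. The following is a minimal shell sketch and not part of the original exchange. It assumes the summary file (libhdf5.settings, typically installed next to the library) contains a line of the form "Parallel HDF5: yes" or "Parallel HDF5: no"; the function name hdf5_is_parallel and the example path in the note below are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether an HDF5 build summary says parallel I/O is enabled.
# Assumption: the summary contains a line like "Parallel HDF5: yes".
# hdf5_is_parallel is a hypothetical helper name, not an HDF5 tool.
hdf5_is_parallel() {
    settings_file=$1

    # Without a readable summary file we cannot tell either way.
    if [ ! -r "$settings_file" ]; then
        echo "unknown"
        return 2
    fi

    # Extract the first word after "Parallel HDF5:" (leading spaces allowed).
    value=$(sed -n 's/^[[:space:]]*Parallel HDF5:[[:space:]]*\([A-Za-z]*\).*$/\1/p' \
        "$settings_file" | head -n 1)

    case "$value" in
        yes) echo "yes"; return 0 ;;
        no)  echo "no"; return 1 ;;
        *)   echo "unknown"; return 2 ;;
    esac
}
```

For example, `hdf5_is_parallel /usr/local/lib/libhdf5.settings` would print yes, no, or unknown; the actual location depends on the install prefix. A "no" here would suggest the library was configured without --enable-parallel, in which case parallel netCDF-4 access would not be expected to work regardless of the file system.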