This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Sebastian,

> I tried to compile with netCDF 4.3.1-rc2, but now my program
> crashes because of an MPI error:
>
> *** An error occurred in MPI_Allreduce: the reduction operation MPI_MAX
> is not defined on the MPI_BYTE datatype
> *** on communicator MPI COMMUNICATOR 4 DUP FROM 0
> *** MPI_ERR_OP: invalid reduce operation
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>
> I'm using OpenMPI 1.4.3.

I'm assuming the program that crashes is the test.cpp you attached in
your original support question. I tried to duplicate the problem using
OpenMPI 1.7.2_1 on an OSX platform, and got a different error:

    $ mpicxx test.cpp -o test -I${NCDIR}/include -I${H5DIR}/include \
          -L${NCDIR}/lib -L${H5DIR}/lib -lnetcdf -lhdf5_hl -lhdf5 -ldl -lm -lz -lcurl
    $ ./test
    Start on rank 0: 0 0
    Count on rank 0: 1 0
    Assertion failed: (size), function H5MM_calloc, file ../../src/H5MM.c, line 95.
    [mort:71677] *** Process received signal ***
    [mort:71677] Signal: Abort trap: 6 (6)
    [mort:71677] Signal code:  (0)
    [mort:71677] [ 0] 2  libsystem_c.dylib  0x00007fff939b994a _sigtramp + 26
    [mort:71677] [ 1] 3  ???                0x0000000000000000 0x0 + 0
    [mort:71677] [ 2] 4  libsystem_c.dylib  0x00007fff93a11e2a __assert_rtn + 146
    [mort:71677] [ 3] 5  test               0x0000000108eeea10 H5MM_calloc + 256
    [mort:71677] [ 4] 6  test               0x0000000108d4ca3e H5D__chunk_io_init + 1534
    [mort:71677] [ 5] 7  test               0x0000000108d8a45c H5D__write + 4028
    [mort:71677] [ 6] 8  test               0x0000000108d87460 H5D__pre_write + 3552
    [mort:71677] [ 7] 9  test               0x0000000108d8658c H5Dwrite + 732
    [mort:71677] [ 8] 10 test               0x0000000108c8ac27 nc4_put_vara + 3991
    [mort:71677] [ 9] 11 test               0x0000000108ca0564 nc4_put_vara_tc + 164
    [mort:71677] [10] 12 test               0x0000000108ca04ab NC4_put_vara + 75
    [mort:71677] [11] 13 test               0x0000000108c08240 NC_put_vara + 288
    [mort:71677] [12] 14 test               0x0000000108c092d4 nc_put_vara_int + 100
    [mort:71677] [13] 15 test               0x0000000108bf2e56 main + 630
    [mort:71677] [14] 16 libdyld.dylib      0x00007fff886fd7e1 start + 0
    [mort:71677] [15] 17 ???                0x0000000000000001 0x0 + 1
    [mort:71677] *** End of error message ***
    Abort

> I think the bug was introduced in this commit:
> https://github.com/Unidata/netcdf-c/pull/4

We're looking at the problem, thanks for reporting it.

--Russ
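For context, the MPI error quoted above reflects a rule in the MPI
standard itself: predefined reduction operations such as MPI_MAX are
not defined for the MPI_BYTE datatype, which is why Open MPI aborts
with MPI_ERR_OP. A minimal, hypothetical illustration of the
constraint (not code from this thread):

    /* Hypothetical illustration, not code from this thread: MPI_MAX is
     * only defined for typed data, so a reduction over raw bytes must
     * use an appropriate datatype such as MPI_INT instead of MPI_BYTE. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, global_max;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Invalid: MPI_Allreduce(&rank, &global_max, sizeof(int),
         * MPI_BYTE, MPI_MAX, MPI_COMM_WORLD) aborts with MPI_ERR_OP,
         * exactly as in the error message quoted above. */

        /* Valid: reduce a typed value instead. */
        MPI_Allreduce(&rank, &global_max, 1, MPI_INT, MPI_MAX,
                      MPI_COMM_WORLD);
        if (rank == 0)
            printf("max rank = %d\n", global_max);

        MPI_Finalize();
        return 0;
    }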
> Best regards,
> Sebastian
>
> On 22.08.2013 18:28, Unidata netCDF Support wrote:
> > Hi Sebastian,
> >
> >> my problem sounds similar to that bug, but it is different. My program
> >> also hangs when using collective MPI I/O.
> >>
> >> According to the bug report, only an issue with independent I/O was fixed.
> >
> > You're right, but we think we have a fix for the collective I/O hang now,
> > available in the netCDF-C 4.3.1-rc2 version (a release candidate):
> >
> > https://github.com/Unidata/netcdf-c/releases/tag/v4.3.1-rc2
> >
> > At your convenience, please let us know if it fixes the problem.
> >
> > --Russ
> >
> >> On 06.08.2013 00:09, Unidata netCDF Support wrote:
> >>> Hi Sebastian,
> >>>
> >>> Could you tell us if this recently fixed bug sounds like what you
> >>> found?
> >>>
> >>> https://bugtracking.unidata.ucar.edu/browse/NCF-250
> >>>
> >>> If so, the fix will be in netCDF release 4.3.1, a release candidate
> >>> for which will soon be announced.
> >>>
> >>> --Russ
> >>>
> >>>> Hi everybody,
> >>>>
> >>>> I just figured out that using collective MPI I/O on variables with
> >>>> unlimited dimensions can lead to deadlocks or wrong files.
> >>>>
> >>>> I have attached a small example program which can reproduce the
> >>>> deadlock (and wrong output files, depending on the variable "count").
> >>>>
> >>>> Did I do anything wrong, or is this a known bug?
> >>>>
> >>>> My configuration:
> >>>> hdf5 1.8.11
> >>>> netcdf 4.3
> >>>> openmpi (default ubuntu installation)
> >>>>
> >>>> Compile command:
> >>>> mpicxx test.cpp -I/usr/local/include -L/usr/local/lib -lnetcdf -lhdf5_hl -lhdf5 -lz
> >>>> (netcdf and hdf5 are installed in /usr/local)
> >>>>
> >>>> Best regards,
> >>>> Sebastian
> >>>>
> >>>> --
> >>>> Sebastian Rettenberger, M.Sc.
> >>>> Technische Universität München
> >>>> Department of Informatics
> >>>> Chair of Scientific Computing
> >>>> Boltzmannstrasse 3, 85748 Garching, Germany
> >>>> http://www5.in.tum.de/
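Sebastian's attached test.cpp is not preserved in this archive. As a
sketch of the pattern under discussion (collective parallel I/O on a
netCDF-4 variable with an unlimited dimension, reaching nc_put_vara_int
as in the stack trace above), a minimal program might look like the
following. The file name, dimension names, variable name, and per-rank
decomposition are illustrative assumptions, not the contents of the
original attachment:

    /* Minimal sketch, NOT the original test.cpp: collective parallel
     * writes to a netCDF-4 variable with an unlimited dimension.
     * File, dimension, and variable names are illustrative. */
    #include <mpi.h>
    #include <netcdf.h>
    #include <netcdf_par.h>
    #include <stdio.h>

    #define CHECK(e) do { int s_ = (e); if (s_ != NC_NOERR) { \
        fprintf(stderr, "netCDF error: %s\n", nc_strerror(s_)); \
        MPI_Abort(MPI_COMM_WORLD, 1); } } while (0)

    int main(int argc, char **argv)
    {
        int rank, nprocs, ncid, varid, dimids[2];
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Create a netCDF-4/HDF5 file for parallel access (NC_MPIIO
         * was the flag used in the netCDF 4.3 era). */
        CHECK(nc_create_par("test.nc", NC_NETCDF4 | NC_MPIIO,
                            MPI_COMM_WORLD, MPI_INFO_NULL, &ncid));
        CHECK(nc_def_dim(ncid, "time", NC_UNLIMITED, &dimids[0]));
        CHECK(nc_def_dim(ncid, "x", nprocs, &dimids[1]));
        CHECK(nc_def_var(ncid, "v", NC_INT, 2, dimids, &varid));
        CHECK(nc_enddef(ncid));

        /* Collective access on the variable: the mode in which the
         * hang (and later the MPI_Allreduce abort) was reported. */
        CHECK(nc_var_par_access(ncid, varid, NC_COLLECTIVE));

        /* Each rank writes one element along the unlimited dimension;
         * the thread reports that varying "count" across ranks could
         * also produce wrong output files. */
        size_t start[2] = {0, (size_t)rank};
        size_t count[2] = {1, 1};
        int value = rank;
        printf("Start on rank %d: %zu %zu\n", rank, start[0], start[1]);
        printf("Count on rank %d: %zu %zu\n", rank, count[0], count[1]);
        CHECK(nc_put_vara_int(ncid, varid, start, count, &value));

        CHECK(nc_close(ncid));
        MPI_Finalize();
        return 0;
    }

Built the same way as the compile commands quoted above, a program of
this shape exercises the nc_put_vara_int / H5Dwrite code path shown in
the stack trace.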
Russ Rew                                      UCAR Unidata Program
address@hidden                                http://www.unidata.ucar.edu

Ticket Details
===================
Ticket ID: RQB-854711
Department: Support netCDF
Priority: Normal
Status: Closed