[netCDF #MRB-844393]: make check failed
- Subject: [netCDF #MRB-844393]: make check failed
- Date: Tue, 24 Jun 2014 10:13:23 -0600
Hello Wei-keng,
Thank you for following up. I do not know if it is pertinent, but we only
see the issue if we specify more MPI processes than there are processors
available to them. For example, running 4 processes on a VM with 4 CPUs
allocated to it works. If I reduce the allocation to 2 CPUs, we start to see
the failure. I can envision circumstances in which a failure would be the
expected behavior, and circumstances in which the tests should still
succeed. Can you provide guidance as to which I should expect to see?
The error we're receiving is along the lines of 'no such file/file
inaccessible'; I don't have the exact string in front of me. Hopefully
that is useful. I'm glad to hear you are seeing successes; that makes me
even more confident that the issue is with our test, not the pnetcdf
functionality.
Have a great morning, I appreciate your input on this.
-Ward
address@hidden> wrote:
> New Client Reply: make check failed
>
> Hi, Ward,
>
> After carefully reading one of your earlier posts, I notice that the error
> came from line 170, which is a call to nc_open_par(). This indicates a
> file open issue, not a read/write issue. Sorry for giving you the wrong
> information.
> When I saw the words "Unexpected result", my first instinct (from testing
> other nc test programs) told me that it came from a read-after-write check.
>
> I tested tst_parallel2.c using netCDF 4.3.0 and PnetCDF 1.4.1 on NFS,
> running 1, 4, and 8 MPI processes on a cluster with 8 nodes (dual cores
> each). All tests ran successfully. The same holds for netCDF 4.3.2 (netCDF
> 4.3.2 has a compile warning against PnetCDF 1.4.1 due to "NC4_LAST_ERROR"
> being redefined. I will fix that in PnetCDF 1.5.0.)
>
> In your case, maybe you want to call nc_strerror() to check the returned
> error code for further info.
>
> Wei-keng
>
> On Jun 23, 2014, at 5:21 PM, Unidata netCDF Support wrote:
>
> > Hello Wei-keng,
> >
> > Thank you very much for the information! It is very helpful, and sheds
> > some light on the expected behavior for pnetcdf. I will pass the
> > information along!
> >
> > Thanks once again, have a great day,
> >
> > -Ward
> >
> >
> > address@hidden> wrote:
> >
> >> New Client Reply: make check failed
> >>
> >> Hi, Ward
> >>
> >> I just saw this post on-line. Since it involves PnetCDF, maybe
> >> I can provide some info from my past experience.
> >>
> >> An "unexpected result" in parallel runs but not in sequential runs
> >> is usually caused by a non-POSIX-compliant file system (in most
> >> cases, NFS). This is because data written earlier may still be cached
> >> in local memory, so reading it from a remote node fails to return
> >> the expected data. To reproduce this failure, you can use the
> >> -machinefile option of the mpiexec command to run on 2 or more compute
> >> nodes.
> >>
> >> In PnetCDF, we recommend that users run "make ptest" (the parallel test)
> >> on a parallel file system (PVFS, Lustre, GPFS, etc.) for the above
> >> reason. Let me know if you are using NFS and whether my reasoning about
> >> the cause makes sense to you.
> >>
> >> (hope this email can get through, as I am not a registered user of this
> >> mailing list.)
> >>
> >> Wei-keng
> >>
> >>> Hello Juanjo,
> >>>
> >>> After a bit of investigation, we've come to the conclusion that the
> >>> issue is likely with the test itself, not with an error in the pnetcdf
> >>> functionality. This test runs fine as a single process, and only began
> >>> failing recently when we started invoking it via 'mpiexec' with more
> >>> than 2 processes. I've spent a lot of time trying to track down where
> >>> an error would have been introduced, and I cannot find a place in our
> >>> archives where the test passed with multiple processes.
> >>>
> >>> I will continue to investigate this and work on fixing the test so that
> >>> it works properly. In the meantime, as long as your pnetcdf library
> >>> passed 'make testing', I think you should be ok and can ignore this
> >>> error.
> >>>
> >>> Thank you very much for bringing this to our attention!
> >>>
> >>> Have a good week,
> >>>
> >>> -Ward
> >>>
> >>>
> >>
> >>
> >>
> >> Ticket Details
> >> ===================
> >> Ticket ID: MRB-844393
> >> Department: Support netCDF
> >> Priority: Normal
> >> Status: Open
> >> Link:
> >> https://www.unidata.ucar.edu/esupport/staff/index.php?_m=tickets&_a=viewticket&ticketid=24086
> >>
> >>
> >
>