[Support #RRP-498133]: Re: : netCDF #IYL-401919: HDF5/NetCDF4 error with lustre NO_FLOCK
- Subject: [Support #RRP-498133]: Re: : netCDF #IYL-401919: HDF5/NetCDF4 error with lustre NO_FLOCK
- Date: Tue, 26 Mar 2019 09:37:14 -0600
[Minor point: please send (or cc) to address@hidden
and not to address@hidden]
At this point I have lost track of the netcdf-c-related problem.
Now that we have some understanding of the HDF5 issue,
can you restate the netcdf-c problem in that context?
> Hmmm....
>
> [ictp@wombele17 hdf5-1.10.5]$ env | grep HDF5
> HDF5_USE_FILE_LOCKING=FALSE
> [ictp@wombele17 hdf5-1.10.5]$ make check
> ===Serial tests in test begin mar 26 mar 2019, 10.43.13, CET===
> make[4]: Entering directory `/SCRATCH/ictp/testh5/hdf5-1.10.5/test'
> ============================
> No need to test testhdf5 again.
> ============================
> No need to test cache again.
> ============================
> No need to test cache_api again.
> ============================
> No need to test cache_image again.
> ============================
> No need to test cache_tagging again.
> ============================
> No need to test lheap again.
> ============================
> No need to test ohdr again.
> ============================
> No need to test stab again.
> ============================
> No need to test gheap again.
> ============================
> No need to test evict_on_close again.
> ============================
> No need to test farray again.
> ============================
> No need to test earray again.
> ============================
> No need to test btree2 again.
> ============================
> No need to test fheap again.
> ============================
> No need to test pool again.
> ============================
> No need to test accum again.
> ============================
> No need to test hyperslab again.
> ============================
> No need to test istore again.
> ============================
> No need to test bittests again.
> ============================
> No need to test dt_arith again.
> ============================
> No need to test page_buffer again.
> ============================
> No need to test dtypes again.
> ============================
> No need to test dsets again.
> ============================
> No need to test chunk_info again.
> ============================
> No need to test cmpd_dset again.
> ============================
> No need to test filter_fail again.
> ============================
> No need to test extend again.
> ============================
> No need to test direct_chunk again.
> ============================
> No need to test external again.
> ============================
> No need to test efc again.
> ============================
> No need to test objcopy again.
> ============================
> No need to test links again.
> ============================
> No need to test unlink again.
> ============================
> No need to test twriteorder again.
> ============================
> No need to test big again.
> ============================
> No need to test mtime again.
> ============================
> No need to test fillval again.
> ============================
> No need to test mount again.
> ============================
> No need to test flush1 again.
> ============================
> No need to test flush2 again.
> ============================
> No need to test app_ref again.
> ============================
> No need to test enum again.
> ============================
> No need to test set_extent again.
> ============================
> No need to test ttsafe again.
> ============================
> No need to test enc_dec_plist again.
> ============================
> No need to test enc_dec_plist_cross_platform again.
> ============================
> No need to test getname again.
> ============================
> No need to test vfd again.
> ============================
> No need to test ntypes again.
> ============================
> No need to test dangle again.
> ============================
> No need to test dtransform again.
> ============================
> No need to test reserved again.
> ============================
> No need to test cross_read again.
> ============================
> No need to test freespace again.
> ============================
> No need to test mf again.
> ============================
> No need to test vds again.
> ============================
> No need to test file_image again.
> ============================
> No need to test unregister again.
> ============================
> No need to test cache_logging again.
> ============================
> No need to test cork again.
> ============================
> Testing swmr
> ============================
> swmr Test Log
> ============================
> Testing H5Drefresh()--concurrent access for latest format
> PASSED
> Testing H5Drefresh()--concurrent access for non-latest-format
> PASSED
> Testing multiple--single process access for latest format
> *FAILED*
> HDF5-DIAG: Error detected in HDF5 (1.10.5) thread 0:
> #000: H5F.c line 1354 in H5Fstart_swmr_write(): unable to convert file format
> major: File accessibilty
> minor: Can't convert datatypes
> #001: H5Fint.c line 3410 in H5F__start_swmr_write(): unable to unlock the file
> major: File accessibilty
> minor: Unable to open file
> #002: H5FD.c line 1698 in H5FD_unlock(): driver unlock request failed
> major: Virtual File Layer
> minor: Can't update object
> #003: H5FDsec2.c line 990 in H5FD_sec2_unlock(): file locking disabled on this file system (use HDF5_USE_FILE_LOCKING environment variable to override), errno = 38, error message = 'Function not implemented'
> major: File accessibilty
> minor: Bad file ID accessed
> *FAILED*
> [...]
>
> It looks like the environment variable is set, but the test still fails.
> Moreover, the problem with unreliable netCDF4 on top of HDF5 persists.
>
> Testing swmr
> ============================
> swmr Test Log
> ============================
> Testing H5Drefresh()--concurrent access for latest format
> PASSED
> Testing H5Drefresh()--concurrent access for non-latest-format
> PASSED
> Added by Graziano here
> HDF5_USE_FILE_LOCKING = FALSE
> Re-setting variable
> HDF5_USE_FILE_LOCKING = FALSE
> Testing multiple--single process access for latest format
> *FAILED*
> HDF5-DIAG: Error detected in HDF5 (1.10.5) thread 0:
>
> --- swmr_org.c 2019-03-26 10:56:07.000000000 +0100
> +++ swmr.c 2019-03-26 10:54:50.000000000 +0100
> @@ -6723,6 +6723,14 @@
> int rbuf = 0;
> int wbuf = 0;
>
> + fprintf(stderr,"Added by Graziano here\n");
> + fprintf(stderr,"HDF5_USE_FILE_LOCKING = %s\n",
> + getenv("HDF5_USE_FILE_LOCKING"));
> + fprintf(stderr,"Re-setting variable\n");
> + (void) setenv("HDF5_USE_FILE_LOCKING","FALSE",1);
> + fprintf(stderr,"HDF5_USE_FILE_LOCKING = %s\n",
> + getenv("HDF5_USE_FILE_LOCKING"));
> +
> /* Output message about test being performed */
> if(new_format) {
> TESTING("multiple--single process access for latest format");
>
> ##################
>
> Where is this execve?
>
> G.
>
> On 25/03/19 22:03, HDF Helpdesk wrote:
> >
> > Hi Graziano,
> >
> > The developer sent this follow-up, which looks useful to know:
> >
> > Note that the environment variable is checked on each call to
> > H5Fopen/close (actually H5F_open, internally). Environments are weird,
> > though, so they may be brushing up against rough edges there. For
> > example, if I recall correctly, a process makes a copy of its environment
> > variables when it starts up and setenv() modifies the copy, so firing off
> > two processes and calling setenv() in one won't modify the environment in
> > the other. I'd also be willing to bet that forking a new process and
> > setting the environment variable in one will not change it in the other.
> > Also, if they are using execve() and not passing the environment in, that
> > is obviously going to be a problem. The safest thing to do would be to
> > set HDF5_USE_FILE_LOCKING on the command line of any tests they run and
> > be careful with execve().
> >
> > -Barbara
> >
> > ==============================================
> > Barbara Jones, The HDF Group, www.hdfgroup.org
> > Service Desk: help.hdfgroup.org
> > Email Address: address@hidden
> > ==============================================
> >
> >
> >
>
>
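As an aside on the environment behavior described in the quoted HDF
Helpdesk note above, here is a minimal C sketch, assuming a POSIX system
with /bin/sh available. The two helper functions are hypothetical names
made up for this illustration; they are not part of HDF5 or netcdf-c, and
nothing here says how the HDF5 test harness actually launches its
processes. The first helper checks that a setenv() in a forked child does
not change the parent's copy; the second checks whether a program started
with execve() sees HDF5_USE_FILE_LOCKING, depending on whether the
environment is passed in.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper (illustration only).
 * Sets HDF5_USE_FILE_LOCKING=FALSE in this process, forks a child that
 * overwrites it with TRUE, and then checks the parent's value.
 * Returns 1 if the parent still sees FALSE, 0 if not, -1 on error. */
int parent_unaffected_by_child_setenv(void)
{
    int status;
    pid_t pid;
    const char *value;

    if (setenv("HDF5_USE_FILE_LOCKING", "FALSE", 1) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: this changes only the child's copy of the environment. */
        setenv("HDF5_USE_FILE_LOCKING", "TRUE", 1);
        _exit(0);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    value = getenv("HDF5_USE_FILE_LOCKING");
    return value != NULL && strcmp(value, "FALSE") == 0;
}

/* Hypothetical helper (illustration only).
 * Starts /bin/sh with execve() and lets the shell test whether it sees
 * HDF5_USE_FILE_LOCKING=FALSE. If pass_environment is nonzero the current
 * environment is handed to execve(); otherwise an empty one is.
 * Returns 1 if the new program saw the variable, 0 if not, -1 on error. */
int exec_child_sees_variable(int pass_environment)
{
    char *const argv[] = {
        "sh", "-c", "test \"$HDF5_USE_FILE_LOCKING\" = FALSE", NULL
    };
    char *const empty_env[] = { NULL };
    int status;
    pid_t pid;

    if (setenv("HDF5_USE_FILE_LOCKING", "FALSE", 1) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (pass_environment)
            execve("/bin/sh", argv, environ);
        else
            execve("/bin/sh", argv, empty_env);
        _exit(127); /* reached only if execve() itself failed */
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 127)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Called from a small main(), parent_unaffected_by_child_setenv() should
return 1, exec_child_sees_variable(1) should return 1, and
exec_child_sees_variable(0) should return 0. The last case is the
"execve() and not passing the environment in" situation from the quoted
note.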
=Dennis Heimbigner
Unidata
Ticket Details
===================
Ticket ID: RRP-498133
Department: Support netCDF
Priority: Normal
Status: Open
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata
inquiry tracking system and then made publicly available through the web. If
you do not want to have your interactions made available in this way, you must
let us know in each email you send to us.