[netCDF #IJQ-160428]: NetCDF no-fill option
- Date: Tue, 25 Mar 2014 14:30:14 -0600
Hi Jim,
> NCEP has another code using NETCDF that is writing twice and reading once.
> RTOFS is using nf90_create to open the files. How do I set the NOFILL
> option when nf90_create is used to open the file?
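To turn off prefilling, call NF90_SET_FILL with NF90_NOFILL on the ncid
returned by nf90_create, before writing any data. Here's the documentation
for the NF90_SET_FILL function: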
http://www.unidata.ucar.edu/netcdf/docs/netcdf-f90.html#NF90_005fSET_005fFILL
--Russ
> Jim Abeles
> Certified IT Specialist
> IBM Systems and Technology Group
> Technical Computing: Worldwide Weather Solutions
> 301 879-3283
> address@hidden
>
>
>
>
> From: "Unidata netCDF Support" <address@hidden>
> To: address@hidden,
> Cc: address@hidden, address@hidden, James Abeles/Bethesda/IBM@IBMUS, address@hidden
> Date: 02/14/2014 11:58 AM
> Subject: [netCDF #IJQ-160428]: NetCDF no-fill option
>
>
>
> Hi John, Dave,
>
> > The version NCEP is using is v3.6.3, so the issue here:
> >
> > http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2011/msg00137.html
> >
> > would certainly be a concern; but perhaps not for WRF, since WRF always
> > writes all dimensions of a variable in order and in ascending index
> > order (referring to necessary condition #2 on the web page). Do you
> > think that means we're safe from this bug when using NOFILL?
>
> Yes. However, many other improvements and bug fixes have been made to
> netCDF, including netCDF-3, since version 3.6.3 was last released in
> June 2008, so I would recommend considering upgrading to a more current,
> supported version built with --disable-netcdf-4 to get the benefits of
> the enhancements and bug fixes. Using the --disable-netcdf-4 configure
> option means it's not necessary to build the HDF5 library. There have
> also been improvements and bug fixes to the netCDF-Fortran libraries
> since 3.6.3, which would require upgrading the netCDF-C library, because
> the netCDF-Fortran software is now a separately developed and maintained
> software package. Upgrading from 3.6.3 is something you could consider
> at any time in the future without worrying about backward compatibility,
> since more recent versions continue to support all the APIs and formats
> supported by 3.6.3, with added benefits for netCDF-3 users, including
> improved DAP access and easy transition to the netCDF-4 classic model
> format that supports compression and chunking.
>
> > Regarding the condition mentioned in your note, Russ, there's no
> > guarantee that WRF will write variables in the order in which they're
> > defined, though it typically does. My interpretation of the statement
> > in your note is that there's a potential problem if the writing is
> > interrupted (say, by program failure?) but okay otherwise. I don't
> > think we'd ever trust a WRF history file that was written at the end
> > of an abended run anyway, so we're probably okay using NOFILL on that
> > score too. Do you agree?
>
> Yes.
>
> > Related issue, Russ: Jim Abeles (IBM, CC'd here) also mentioned he was
> > seeing NetCDF read of a few MB for each large write operation. Can you
> > say what is causing that, and if it's necessary?
>
> When appending data to a record variable, it's necessary to read the
> last disk block in the record in order to keep the appended record
> contiguous with the previous record (modulo 4-byte alignment). Also, if
> the number of records is increased by a write, it's necessary to update
> that number in the header of a netCDF file, which requires reading and
> rewriting the count in the first disk block of the file.
>
> You might avoid the read of the last record disk block if you could
> arrange that each record contains a whole number of disk-blocks. But
> that would be non-portable, as the physical disk block size varies from
> platform to platform. However, it might be possible by padding each
> record with "data" of a size computed when the disk block size is known,
> at the beginning of a model run.
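
A minimal sketch of the padding arithmetic described above, in Python,
assuming the size of one record and the block size are both known in
bytes. The function names and the example sizes are hypothetical, and
using the file system's reported block size (os.statvfs, POSIX only) as
the "disk block size" is an assumption that may not match what the netCDF
I/O layer actually uses on a given platform.

```python
import os


def record_padding(record_bytes: int, block_bytes: int) -> int:
    """Bytes of filler needed so one record fills a whole number of blocks."""
    if record_bytes < 0 or block_bytes <= 0:
        raise ValueError("record_bytes must be >= 0 and block_bytes > 0")
    return (-record_bytes) % block_bytes


def filesystem_block_size(path: str = ".") -> int:
    """Block size reported by the file system holding `path` (POSIX only).

    Assumption: this stands in for the physical disk block size.
    """
    return os.statvfs(path).f_bsize


# Hypothetical sizes: a 10,000,000-byte record and 4096-byte blocks.
pad = record_padding(10_000_000, 4096)
print(pad)  # 2432, so the padded record is 10,002,432 bytes = 2442 blocks
```

If the record size and the block size are both multiples of 4, the
padding is too, so it would not disturb the 4-byte alignment mentioned
above. The padding itself would have to be written as an extra record
variable of that many bytes.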
>
> > A clarification: the issue for NCEP's operational use of NetCDF in WRF
> > isn't runtime. WRF is writing through NetCDF asynchronously, so the
> > speed of the writes isn't usually a factor in model run time. The
> > issue is contention for limited system I/O bandwidth on the NCEP
> > operational clusters. There are a lot of I/O intensive jobs running at
> > the same time and having the WRF jobs generate 2x as much output
> > traffic as needed has the effect of further saturating the I/O system.
>
> Understood.
>
> > Dave, it is possible to add the NOFILL write mode to the WRFV3.6
> > netcdf interface as an option that could be controlled at run time
> > with a namelist variable; then users would have a choice whether to
> > use FILL or NOFILL. I could do that. It would be up to the Developers
> > Committee whether to make the default FILL or NOFILL in WRF, but
> > either way that would serve NCEP's needs (assuming NOFILL is safe for
> > their use-cases).
>
> It might be worthwhile to measure the difference in the context of
> operational use, to make sure using NOFILL really makes a significant
> difference in delaying or avoiding saturating the I/O system.
>
> --Russ
>
> > Thanks,
> >
> > John
> >
> >
> > -----Original Message-----
> > From: Unidata netCDF Support [mailto:address@hidden]
> > Sent: Friday, February 14, 2014 6:26 AM
> > To: address@hidden
> > Cc: address@hidden; address@hidden; address@hidden
> > Subject: [netCDF #IJQ-160428]: NetCDF no-fill option
> >
> > Hi Dave,
> >
> > > I am one of the group that provides support to the WRF model. One of
> > > the most used data formats for the WRF model is NetCDF, largely
> > > because of the huge selection of post-processing tools that easily
> > > allows diagnostics and visualization. The WRF model is used in
> > > production at NOAA, and as such the operations staff are always
> > > quite sensitive to the amount of time any model takes. A simple
> > > method to reduce the wall-clock time is taken seriously.
> > >
> > > The NOAA developers have asked a couple of questions about the FILL /
> > > NOFILL option (here we reference this UNIDATA page):
> > >
> > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c/nc_005fset_005ffill.html
> > >
> > > In our model history volumes, if we only open the files for write,
> > > then the NOFILL option seems to be (not only) an OK choice, but
> > > perhaps one that provides a timing reduction. Our tests indicate
> > > that a write with the NOFILL option is a 1x write, compared to a 2x
> > > write + 1x read when doing the FILL option. Would you please verify
> > > this for us? We think that we would like to set the NOFILL option as
> > > the default option in WRF, to benefit our entire community. Would
> > > you please comment on our possible plans in this regard to help us
> > > know if this is the right move. All of our data has TIME as the
> > > unlimited dimension. Other than the HDF5 compression capabilities
> > > from NETCDF4, we are a pretty vanilla user group of NetCDF.
> >
> > I agree that setting the NOFILL option will save time in writing your
> > model history volumes, and would be a reasonable default. The issue is
> > whether the performance benefit outweighs any problems that might
> > occur in determining what data was written and what data are
> > essentially garbage values in case you write the history data
> > variables in a different order from the one used to define the
> > variables in creating the file. For example, if you write the last
> > record variable first and the writing is interrupted, then the values
> > of the previous record variables in that record could be arbitrary.
> > But if you log the creation of history volumes so you know when the
> > writing is complete and buffers are flushed, either by a close or sync
> > call, that seems like it shouldn't be a problem.
> >
> > One other issue is a potentially serious bug related to use of nofill
> > mode and a specific pattern of writing data in versions of netCDF
> > before version 4.1.3 (released in June 2011):
> >
> > [netcdfgroup] Important: potential file corruption using NOFILL mode
> > http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2011/msg00137.html
> >
> > If you're sure that WRF users use version 4.1.3 or greater of the
> > netCDF C library, or that WRF doesn't meet the conditions described
> > above that trigger the bug, then I think it's safe.
> >
> > There was also a less serious bug with use of fill mode in netCDF
> > versions before 4.2, released in May 2012:
> >
> > Fixed turning off fill values in HDF5 layers when NOFILL mode is set
> > in netCDF-4 API
> > https://bugtracking.unidata.ucar.edu/browse/NCF-151
> >
> > > Secondly, we are interested in knowing about the statement:
> > >   The use of this feature may not be available (or even needed) in
> > >   future releases. Programmers are cautioned against heavy reliance
> > >   upon this feature.
> > > If we get timing reductions, we would like to rely on the continuing
> > > availability of this feature.
> >
> > I think that statement was overly cautious, because we didn't know
> > whether we could support nofill mode in future versions. Now we're
> > committed to backwards compatibility for such features, and I don't
> > anticipate that support for nofill mode would be dropped. The HDF5
> > software layer used by netCDF-4 also supports fill values and
> >
> > > Thanks in advance for your time and explanations.
> >
> > You're welcome, I hope this helps.
> >
> > --Russ
> >
> > > Dave
> > >
> > >
> > > David Gill
> > > Software Engineer
> > > address@hidden
> > > office: (303) 497 8162
> > > fax: (303) 497-8171
> > > Mailing address:
> > > 3450 Mitchell Lane
> > > Boulder, CO 80301
> > >
> > >
> > >
> > >
> > Russ Rew UCAR Unidata Program
> > address@hidden http://www.unidata.ucar.edu
> >
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: IJQ-160428
> > Department: Support netCDF
> > Priority: Normal
> > Status: Closed
> >
> >
Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu