This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Rick,

> Howdy, I'd like to make certain an application here in SCD that uses
> netCDF can properly take advantage of LFS and netCDF's support of it.
>
> I know the steps I have to follow with regard to compilation/etc. for my
> application to use the LFS API.

The situation with LFS is about to change with version 3.6.0-beta, and there
will be an announcement about it soon, but here's a summary.

The original netCDF format (known by its file "magic number" as CDF1) has a
32-bit size and a 32-bit file offset for each variable, which limits the
size of files even when compiled with LFS support.  Basically, the size of
the last fixed-size variable or the last record variable is unconstrained,
as long as its offset from the beginning of the file is less than 2^31.
There are examples of how you can exploit this to write terabyte netCDF
files here:

  http://my.unidata.ucar.edu/content/software/netcdf/f90/documentation/f90-html-docs/guide9.html#2236524

but the limitations of permitting only a single large fixed-size variable or
multiple large record variables are fairly constraining.

With 3.6.0, we're introducing the first new format for netCDF access since
1988, changing the 32-bit file offsets to 64-bit offsets with some code
contributed by Greg Sjaardema of Sandia.  The new file "magic number" will
be 'C' 'D' 'F' '\002', and the library will still read and write CDF1 files
by default.  However, if a user creates a file with the NC_64BIT_OFFSET flag
(or its equivalent in the Fortran, C++, or Java interfaces), the new format
with 64-bit offsets will be used.  Assuming the library is compiled with LFS
support, this eliminates many of the constraints on creating large netCDF
files.  The remaining rules are:

 - The size of each fixed-size variable except the last fixed-size variable
   must be strictly less than 2**32 = 4294967296 bytes.  The last fixed-size
   variable can be any size supported by the file system, e.g. terabytes.

 - The size of one record's worth of data for each record variable except
   the last record variable must also be strictly less than 2**32 bytes.
   The size of one record's worth of data for the last record variable is
   unconstrained except by the file system.

So in particular, you'll be able to have as many 4-Gbyte fixed-size
variables as you want, with the last variable even larger.

> Could you please pass along some hints/pointers as to how the netCDF
> goes about dealing with large files (stat()'ing, open()'ing, etc)? I'll
> start poking around to see what I find, but figure (at least one of) you
> could send me to the exact spot to look, thanks!

I don't think there is any difference in the way large files are handled in
terms of stat()'ing, open()'ing, etc., since this all "just works" for large
files when the library is compiled with LFS support.  I'm currently working
on some needed additions to netCDF 3.6.0 to return errors in case the rules
on variable sizes are violated.  In case you're wondering, the size of a
variable is a size_t, which is still 32 bits on most systems even when
compiled with LFS, and that's why there are still constraints on maximum
variable sizes.

In a year, when netCDF-4 is available, we may even be able to eliminate
these constraints with the HDF5-based file format that will be supported in
netCDF-4 (with full backward compatibility for the current formats, of
course ...).

_____________________________________________________________________
Russ Rew                                         UCAR Unidata Program
address@hidden                    http://www.unidata.ucar.edu/staff/russ
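As a minimal C sketch of the above, assuming netCDF 3.6.0 or later built
with LFS support (the file, dimension, and variable names here are just
placeholders), creating a 64-bit-offset file and checking its magic number
might look like:

    #include <stdio.h>
    #include <stdlib.h>
    #include <netcdf.h>

    /* Abort with the netCDF error message if a call fails. */
    static void check(int status)
    {
        if (status != NC_NOERR) {
            fprintf(stderr, "netCDF error: %s\n", nc_strerror(status));
            exit(1);
        }
    }

    int main(void)
    {
        int ncid, dimid, varid;

        /* Create a file in the new 64-bit-offset (CDF2) format. */
        check(nc_create("big.nc", NC_CLOBBER | NC_64BIT_OFFSET, &ncid));
        check(nc_def_dim(ncid, "x", 1000000, &dimid));
        check(nc_def_var(ncid, "data", NC_DOUBLE, 1, &dimid, &varid));
        check(nc_enddef(ncid));
        check(nc_close(ncid));

        /* The first four bytes should be 'C' 'D' 'F' '\002'. */
        FILE *fp = fopen("big.nc", "rb");
        if (fp != NULL) {
            unsigned char magic[4];
            if (fread(magic, 1, 4, fp) == 4)
                printf("magic: %c %c %c \\%03o\n",
                       magic[0], magic[1], magic[2], magic[3]);
            fclose(fp);
        }
        return 0;
    }

No special flag is needed to read such a file back: nc_open() detects the
format from the magic number.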