This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Tomas,

I'm sorry to have taken so long to answer your question about the size of netCDF files produced by the C and C++ interfaces. Investigating the problem revealed a bug that will be fixed in the next release.

The explanation is that the NcVar::add_att( NcToken attname, const char* ) function invoked in netcdf/c++/example.cc, used to define string attributes, as in

    P->add_att("units", "hectopascals");

stores the string attributes with the trailing "\0" character counted as part of the attribute value. Here's the relevant code from the NcVar::add_att member function in netcdf/c++/netcdf.cc:

    if (ncattput(the_file->id(), the_id, aname, (nc_type) ncChar,
                 strlen(val) + 1, val) == ncBad)

The C version in example.c provides explicit lengths, and doesn't include the trailing "\0" character, for example:

    ncattput (ncid, P_id, "units", NC_CHAR, 12, (void *)"hectopascals");

Hence the C++ version is storing an extra character, the trailing "\0", for every string attribute.

When ncdump reads and prints a string attribute, it doesn't include any trailing null byte, since that is assumed to be the end-of-string marker from C. Hence ncdump will print exactly the same attribute value for a four-character attribute value "abc\0" as it will for a three-character attribute value "abc". I think the behavior of ncdump is OK in this respect, although it means running ncdump and then ncgen on a file containing attributes with trailing nulls will strip the trailing nulls, so the resulting file will be smaller than the original.

The NetCDF User's Guide recommends:

    In C, fixed-size strings may be written to a netCDF file without the terminating null byte, to save space. Variable-length strings should be written with a terminating null byte so that the intended length of the string can be determined when it is later read.
    ...
    In FORTRAN, fixed-size strings may be written to a netCDF file without a terminating character, to save space.
    Variable-length strings should follow the C convention of writing strings with a terminating null byte so that the intended length of the string can be determined when it is later read by either C or FORTRAN programs.

so it does not require the terminating null byte.

I can fix the inconsistency you have uncovered in either of two ways:

  1. Change the c++/example.cc code so that it includes the trailing null byte in the attribute length for all string attributes.

  2. Change the code for NcVar::add_att( NcToken attname, const char* ) so that it doesn't store the trailing null byte.

I prefer the second fix, but in trying it, I just noticed it requires a rewrite of the NcValues_char::print(ostream&) member function in ncvalues.cc. I've added that to my list of things to do before the alpha-test version of netCDF 2.4 is ready.

Anyway, thanks for being persistent in asking about this problem, even though I was apparently ignoring it the first time you asked. You have uncovered a bug that we will fix.

--Russ
______________________________________________________________________________
Russ Rew                                         UCAR Unidata Program
address@hidden                                   http://www.unidata.ucar.edu
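
The one-byte difference described in the message can be illustrated in plain C without linking against netCDF. The sketch below is only a model of the two length computations and of a reader that treats a null byte as the end of the string. The helper names att_len_with_nul, att_len_without_nul, and printable_len are hypothetical and are not part of the netCDF library or of ncdump.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the length the C++ add_att code shown above
   passes to ncattput, i.e. strlen(val) + 1 (trailing "\0" counted). */
size_t att_len_with_nul(const char *val)
{
    assert(val != NULL);
    return strlen(val) + 1;
}

/* Hypothetical helper: the length a C caller supplies when it leaves
   the trailing "\0" out, as example.c does with the literal 12. */
size_t att_len_without_nul(const char *val)
{
    assert(val != NULL);
    return strlen(val);
}

/* Hypothetical model of a reader that stops at the first null byte:
   given len stored bytes, return how many characters it would print. */
size_t printable_len(const char *bytes, size_t len)
{
    const char *nul;

    assert(bytes != NULL);
    nul = (const char *)memchr(bytes, '\0', len);
    return nul != NULL ? (size_t)(nul - bytes) : len;
}
```

Under these assumptions, "hectopascals" is stored as 13 bytes by the first computation and 12 by the second, while printable_len reports the same count for the four stored bytes "abc\0" as for the three stored bytes "abc". That is why the two files look identical in a text dump yet differ in size.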