This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Rob,

> Over at ANL we've been testing our proposed "CDF-5" file format for
> a while and now it's time for me to get serious about porting that
> work to NetCDF.
>
> One thing I've noticed is that NetCDF4 has relaxed variable size
> limitations, but still addresses those variables with a size_t type:
>
> int nc_put_vara_float (int ncid, int varid,
>                        const size_t start[], const size_t count[],
>                        const float *fp);
>
> What happens in NetCDF-4 if someone wants to create a 1D variable of 5
> GB ?

They would have to do that on a platform on which size_t is larger
than 5 GB i.e. a 64-bit platform. In that case, there's no problem, as
the size_t is typically a 64-bit unsigned quantity. The resulting data
would not be readable on a 32-bit platform, which means the data would
only be portable to other 64-bit platforms. This is an unfortunate
disadvantage of supporting larger dimension sizes. The 64-bit offset
format of netCDF-3 still restricts each dimension size to be 32 bits,
but the netCDF-4 API and format have no such restriction, as far as I
know.

> In Parallel-NetCDF, we use an MPI_Offset type (a 64 bit type on all
> but the most ancient of platforms) in our API. The prototype for
> ncmpi_put_vara_float, for example looks like
>
> int ncmpi_put_vara_float(int ncid, int varid,
>                          const MPI_Offset start[], const MPI_Offset count[],
>                          const float *op);
>
> Note that I'm not speaking about the 'count' parameter here -- that's
> an entire other kettle of fish. I just mean how to describe the start
> of an access to a variable with one or more very large dimensions.
> Heck, for this example, count[] can be all 1s.
>
> I think this example is not contrived, as the FLASH group has a
> workload where they track 4+ billion "things", and so would naturally
> use a variable with a 4+ billion dimension.

I'm not understanding the advantage of MPI_Offset over size_t for the
type of the count array.
For 32-bit platforms, you wouldn't want a 64-bit type for count,
because no object such as an array can be indexed by anything bigger
than a size_t, which is defined as the size of an object. For 64-bit
platforms, size_t seems ample as a type for the count[] array.

Am I misinterpreting your question?

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                          http://www.unidata.ucar.edu

Ticket Details
===================
Ticket ID: FFY-177157
Department: Support netCDF
Priority: Normal
Status: Closed
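
The size_t point in the reply above can be made concrete with a short sketch. It is not taken from the original exchange and does not call netCDF or Parallel-NetCDF. The helper names index_fits_size_t and to_start_index are hypothetical, introduced only for illustration. It assumes a C11 compiler and shows one way a caller might check that a 64-bit element index (for example, into a variable with more than 4 billion elements) can be stored in a size_t start[] entry on the current platform.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: size_t is no wider than 64 bits, so SIZE_MAX can be
 * compared against a uint64_t without losing information. */
static_assert(sizeof(size_t) <= sizeof(uint64_t),
              "size_t wider than 64 bits is not handled by this sketch");

/* Hypothetical helper (not part of netCDF): returns true if a 64-bit
 * element index can be represented in this platform's size_t. */
bool index_fits_size_t(uint64_t index)
{
    return index <= (uint64_t)SIZE_MAX;
}

/* Hypothetical helper: converts a 64-bit index to size_t for use as a
 * start[] entry. Returns false, leaving *out untouched, when the value
 * would not fit, rather than silently wrapping. */
bool to_start_index(uint64_t index, size_t *out)
{
    if (!index_fits_size_t(index))
        return false;
    *out = (size_t)index;
    return true;
}
```

On a platform where size_t is 64 bits, index_fits_size_t(5000000000ULL) returns true. Where size_t is 32 bits it returns false, which corresponds to the case described above in which a variable with a 4+ billion dimension cannot be addressed through a size_t-based interface.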