This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
>From: Nils Olav Handegard <address@hidden>
>Subject: GRIB2NetCDF, filesize, offset and scaling
>Organization: ?
>Keywords: 200007190747.e6J7lJT22878

Hi Nils,

> I've managed to implement software to convert NCEP-GRIB 6h reanalysis
> data to daily mean, stored in NetCDF (I've used lats4d and grads).
>
> Most of these data (daily average) are already available in the NetCDF
> format via ftp, but some of them have to be converted.
>
> The problem is that the files produced by NCEP are half the size: they
> use the representation 'short' with 'add_offset' and 'scale_factor',
> and my files use the 'float' representation. Is there anything to do
> about this? Are there any tools for converting this? I didn't find
> anything in the grads/lats package to set this (I'm sure there is, but
> I don't know the tools that well).

Using the 'add_offset' and 'scale_factor' conventions for packing
floating-point data into 16-bit shorts is a common way to save space,
but the netCDF library doesn't provide any automatic conversions that
use these attributes.

Some software I've seen that follows these conventions includes the FAN
library described at

    http://www.unidata.ucar.edu/packages/netcdf/fan_utils.html

which says in the FAN User's Guide:

    Scaling and Unit Conversion

    All netCDF input and output values are transformed by a linear
    equation defined by the attributes add_offset, scale_factor and
    units, together with any unit defined by the -u option mentioned
    above.  The output units attribute is defined or modified in some
    situations, such as when it is undefined but the corresponding
    input attribute is defined.
    ...

I don't know whether you can just use the FAN utilities directly, or
whether it might be easier to extract the necessary conversion
functions from fanlib, the supporting C library.
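Since the netCDF library won't apply these attributes for you, a reader has to do the linear transformation itself after fetching the packed values. Here is a minimal sketch of that unpacking step, in Python for illustration only; the packed values, attribute numbers, and fill-value sentinel below are all made up, not taken from any real NCEP file:

```python
# Hypothetical attribute values, as they might be read from a 'short'
# netCDF variable's scale_factor, add_offset, and _FillValue attributes.
scale_factor = 0.01
add_offset = 273.15
fill_value = -32767  # assumed missing-value sentinel

# Packed 16-bit values as they would come back from the read call;
# the library returns them untransformed.
packed = [-32767, 0, 1250, -500]

# Manual unpacking: skip fill values, otherwise apply the linear
# transformation  x = x_packed * scale_factor + add_offset.
unpacked = [None if p == fill_value
            else p * scale_factor + add_offset
            for p in packed]
```

The writer's job is the inverse: apply `(x - add_offset) / scale_factor`, round to an integer, and store that.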
In any case, the software is available from

    ftp://ftp.unidata.ucar.edu/pub/netcdf/contrib/fan.tar.Z

although it's "user-contributed" software that we don't support
directly, and the author, Harvey Davies, is no longer actively
supporting it either.

Another possibility is some other user-contributed software,
nc_float.c:

    Harry Edmon's interfaces to ncvarget and ncvarput that convert
    (optionally packed) data to floating point, handling missing data
    and units conversions.

available from the catalog of user-contributed software at

    http://www.unidata.ucar.edu/packages/netcdf/contrib.html

though it's pretty old and may need some updating for netCDF-3.

There may be other utilities that handle the packing attributes, but
I'm not aware of them.  I've appended an excerpt from an earlier reply
that has some example code for packing data, but you may already know
this.  If you find or develop a data access layer that handles this
2:1 packing, please let us know ...

--Russ

_____________________________________________________________________
Russ Rew                                         UCAR Unidata Program
address@hidden                          http://www.unidata.ucar.edu

...

The netCDF library doesn't treat these attributes in any special way,
so you have to use their values for packing before you write values
and unpacking after you read values.

As an example, if you want to pack floating-point values between 950
and 1050 into 8-bit bytes for a program variable named `x' that is to
be stored into a netCDF variable named x_packed, the structure of the
netCDF file might include a data specification like the following:

    variables:
        ...
        byte x_packed(n);
            x_packed:scale_factor = 0.39370079;
            x_packed:add_offset = 950;
            x_packed:_FillValue = 255;
        ...
where we just use the minimum value, 950, for the offset to keep all
packed values nonnegative, and we compute the scale factor using

    scale_factor = (Max - Min) / (2^Nbits - 2)
                 = (1050 - 950) / (256 - 2)
                 = 0.39370079

Now before you store the value x, you pack it with the formula

    x_packed = (x - add_offset) / scale_factor

and you store the byte value x_packed (which will be between 0 and
254) instead.  You can use the byte value 255 for a missing value.
Similarly, when you read the data back in, you can unpack it using the
formula

    x = x_packed * scale_factor + add_offset

If you need more than 8 bits of precision but you still want to store
each value as one netCDF value, you will have to use 16-bit shorts,
and then the formulas above will use Nbits = 16 instead of Nbits = 8.

If you are using C, you may have to declare x_packed to be an
`unsigned char' to get these formulas to work out, or change the
formulas to assume signed values.  In Fortran there are no unsigned
integers, so change the formulas to use signed integers instead.
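The packing arithmetic above can be sketched end to end as follows; this is a plain-arithmetic illustration in Python (the rounding step and the helper names pack/unpack are my own additions, not part of any netCDF API), using the same Min, Max, and Nbits as the example:

```python
# Packing parameters from the example: values in [950, 1050] packed
# into 8-bit bytes, reserving 255 as the missing value.
min_val, max_val, nbits = 950.0, 1050.0, 8
scale_factor = (max_val - min_val) / (2**nbits - 2)  # = 100/254
add_offset = min_val
FILL = 255  # byte value reserved for missing data

def pack(x):
    # x_packed = (x - add_offset) / scale_factor, rounded to the
    # nearest integer in 0..254 (255 is reserved for missing data).
    return int(round((x - add_offset) / scale_factor))

def unpack(x_packed):
    # x = x_packed * scale_factor + add_offset
    return x_packed * scale_factor + add_offset

for x in (950.0, 1013.25, 1050.0):
    p = pack(x)
    assert 0 <= p <= 254
    # round-trip error is at most half a quantization step
    assert abs(unpack(p) - x) <= scale_factor / 2
```

The endpoints map to the extreme codes (950 packs to 0, 1050 to 254), which is why the denominator is 2^Nbits - 2 rather than 2^Nbits - 1: one code is given up to the fill value.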