This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Harvey,

> I am in process of doing some of the simpler tasks on my FAN 'to-do' list.
> One I added to the list while over there was support of the 'signedness'
> attribute.
>
> 'Signedness' is a horrible word (although it passes the standard unix
> 'spell') for a horrible kludge. And the more I looked into it the
> horribler it became. :-(

A weak defense: it's arguably better than the alternative, 'signosity'! :-)

> The user guide entry on 'signedness' (p21) states that bytes default to
> unsigned. But this is inconsistent with following in netcdf.h:
>
> #define FILL_BYTE ((char)-127) /* Largest Negative value */

You're the first to point out this discrepancy, which I hadn't noticed.

> Incidently the Largest Negative value is -128 although it may be better not
> to use this (just as it is best not to use min short of -32768 because it
> can be difficult to print, etc.)

I vaguely recollect we were thinking of platforms that use 1's-complement rather than 2's-complement representations for negative integers; such machines have a -0 that is represented differently from 0 and have one fewer negative values as a result. But now that I think about it, there may not be any 8-bit byte machines that still use 1's-complement representations. The only 1's complement machine I'm familiar with is the Control Data 6x00/7x00 that had 6-bit bytes and 60-bit words.

So if there are no such machines around any more, then you're right and FILL_BYTE should have been -128. Similarly, I think the FILL_SHORT should have been defined as -32768 rather than -32767, and the FILL_LONG should have been defined as -2147483648 rather than -2147483647, but it would break things to correct these now.

Incidentally, I don't see any problems printing these constants, when using the right printf formats.

> My fan system treats bytes as signed.
>
> All the current fill values are unsuitable for unsigned case. The obvious
> unsigned fill value is the max e.g. 255 for unsigned byte.

I agree.
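
The range argument above can be checked with a small C sketch. It is only an illustration: the helper names min_twos_complement, min_ones_complement and format_value are hypothetical and are not part of netcdf.h. An n-bit 2's-complement integer reaches -2^(n-1), while an n-bit 1's-complement integer stops at -(2^(n-1) - 1) because one bit pattern is spent on -0.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helpers for illustration; not part of netcdf.h. */

/* Most negative value of an n-bit 2's-complement integer: -2^(n-1). */
long long min_twos_complement(int bits)
{
    assert(bits >= 2 && bits <= 32);
    return -(1LL << (bits - 1));
}

/* Most negative value of an n-bit 1's-complement integer: -(2^(n-1) - 1).
   One bit pattern represents -0, so one negative value is lost. */
long long min_ones_complement(int bits)
{
    assert(bits >= 2 && bits <= 32);
    return -((1LL << (bits - 1)) - 1);
}

/* Print a value into buf with a format wide enough for any of the
   minimums above; returns the number of characters written. */
int format_value(char *buf, size_t size, long long value)
{
    return snprintf(buf, size, "%lld", value);
}
```

For 8, 16 and 32 bits the 2's-complement minimums are -128, -32768 and -2147483648, and the 1's-complement minimums are the current fill values -127, -32767 and -2147483647. Printing them is unproblematic once the format matches the type. Writing the most negative constant as a literal in source is a separate matter, which is why <limits.h> implementations commonly spell INT_MIN as (-2147483647 - 1).
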

> I thought about using sign of nc_type to represent signedness. So unsigned
> short would have nc_type of -3. But this would end up being half-hearted
> kludgy representation of unsigned types. My feeling is that if unsigned is
> needed then it should be done properly by implementing proper full-blown new
> unsigned types. Perhaps we could implement signed & unsigned 64-bit integers
> at the same time.

The biggest obstacle is the Fortran interface. Although there are (nonstandard but almost universally available) ways to declare bytes and shorts in Fortran, I don't know of any way (even in Fortran 90) to declare unsigned bytes or unsigned shorts. So into what type of variable would a Fortran program read an array of unsigned shorts? It's currently not possible to create a netCDF file from C that you can't access from Fortran, and I think that is a desirable characteristic of netCDF.

We should add support for 64-bit integers (called long longs or hyperlongs in the XDR documentation) when the vendor's XDR libraries include the functions xdr_hyper or xdr_longlong_t. I'm not sure what the current availability of these is, except they seem to be available on Solaris 2.4, but not on SunOS 4.1.4.

> In short term I suggest deprecating 'signedness' attribute in user guide
> (since it will be redundant when proper unsigned types are implemented) &
> stating that meanwhile, bytes default to SIGNED like other integers. Perhaps
> we should first post an item to the netcdfgroup asking if anyone uses the
> attribute because we are considering deprecating it.

So far, the netCDF C library doesn't care whether bytes are signed or unsigned, because no library calls ever do any arithmetic on the data. The purpose of the attribute was to permit communicating the intent of the data provider to data consumers or applications. It would be better to do this with signed or unsigned types, but then we have to figure out how to handle the Fortran interface.
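
To illustrate why the stored bits are the same either way, and one possible answer to the Fortran question, here is a minimal C sketch. The function names are hypothetical and are not netCDF library calls, and the code assumes 8-bit bytes, 16-bit shorts and 2's-complement storage. The same byte pattern is read as signed or as unsigned purely by interpretation, and an unsigned 16-bit value that was read into a signed short can be recovered by widening it into a larger signed type.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers for illustration; not netCDF library calls. */

/* Interpret a stored 8-bit pattern as an unsigned value, 0..255. */
int byte_as_unsigned(unsigned char stored)
{
    return (int)stored;
}

/* Interpret the same 8-bit pattern as 2's-complement signed, -128..127. */
int byte_as_signed(unsigned char stored)
{
    assert(CHAR_BIT == 8);  /* this sketch assumes 8-bit bytes */
    return stored >= 128 ? (int)stored - 256 : (int)stored;
}

/* Recover an unsigned 16-bit value that was read into a signed
   16-bit variable, by widening it into a long.  A language without
   unsigned types could hold the result in its wider signed integer. */
long ushort_from_signed(short as_read)
{
    return as_read < 0 ? (long)as_read + 65536L : (long)as_read;
}
```

With these definitions the pattern 0xFF is 255 through byte_as_unsigned and -1 through byte_as_signed, and a short that reads as -1 becomes 65535 through ushort_from_signed. The cost of this approach is that the caller needs a variable twice as wide as the stored data.
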

Our future plans for packed data (e.g. an array of 10-bit values) will store information in something like an _Offset attribute that will imply whether the unpacked data is signed or not.

Complicating the issue is the C++ implementation, that already uses

    typedef unsigned char ncbyte;

for data of type NC_BYTE and provides conversions on values for both variables and attributes with member functions like the following:

    // The following member functions provide conversions from the value
    // type to a desired basic type.  If the value is out of range,
    // the default "fill-value" for the appropriate type is returned.
    virtual ncbyte as_ncbyte( int n ) const;  // nth value as an unsgnd char
    virtual char as_char( int n ) const;      // nth value as char
    virtual short as_short( int n ) const;    // nth value as short
    virtual long as_long( int n ) const;      // nth value as long
    virtual float as_float( int n ) const;    // nth value as floating-point
    virtual double as_double( int n ) const;  // nth value as double
    virtual char* as_string( int n ) const;   // nth value as string

Although the C++ interface was declared experimental and subject to change, changing its assumption that bytes are unsigned will definitely break some programs, though perhaps not as many as changing the definition of FILL_BYTE from -127 to 255. Of course we urged people in the Users Guide to use their own fill values appropriate to the data rather than accepting the default fill values.

The long and short of it :-) is that we have a problem. Thanks for pointing this out.

> I intend to continue to ignore the 'signedness' attribute in my FAN code.

The C++ interface and ncdump also ignore the 'signedness' attribute, so I think it would be OK to deprecate it. And the XDR underpinnings support unsigned values for all the integer types.
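
The "out of range returns the fill value" behaviour described for the C++ conversions can be sketched in plain C as follows. This is not the C++ interface's actual code: the function names and the SKETCH_ macros are made up for illustration, the signed fill values are the ones quoted earlier in this message, and 255 is the unsigned byte fill suggested above.

```c
#include <assert.h>
#include <limits.h>

/* Fill values as quoted in this message; the macro names are local
   to this sketch and are not the netcdf.h definitions. */
#define SKETCH_FILL_BYTE   (-127)
#define SKETCH_FILL_SHORT  (-32767)
#define SKETCH_FILL_UBYTE  255

/* Convert to a signed byte, or return the fill value if out of range. */
signed char long_as_signed_byte(long value)
{
    if (value < SCHAR_MIN || value > SCHAR_MAX)
        return SKETCH_FILL_BYTE;
    return (signed char)value;
}

/* Convert to an unsigned byte, or return 255 if out of range. */
unsigned char long_as_ubyte(long value)
{
    assert(UCHAR_MAX == 255);  /* this sketch assumes 8-bit bytes */
    if (value < 0 || value > UCHAR_MAX)
        return SKETCH_FILL_UBYTE;
    return (unsigned char)value;
}

/* Convert to a short, or return the fill value if out of range. */
short long_as_short(long value)
{
    if (value < SHRT_MIN || value > SHRT_MAX)
        return SKETCH_FILL_SHORT;
    return (short)value;
}
```

Note that a caller cannot tell a genuine 255 from an out-of-range result in long_as_ubyte, and the same holds for -127 in the signed case. That ambiguity is one reason to choose fill values appropriate to the data rather than rely on the defaults.
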

If there were a way to handle the unsigned types in Fortran, perhaps with some extra functions that converted signed integer values to unsigned for the various netCDF types, I would agree that we should "byte the bullet" and add unsigned types. In that case, it would be best if NC_BYTE were interpreted as signed also, even in the C++ interface, since there would be an NC_UBYTE type for unsigned bytes.

______________________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                                   http://www.unidata.ucar.edu