This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Gary, et al, a few comments on the cdl's, and more inline. Generally these look quite good.

In consensus.cdl:

1. Will there always be 50 heights for every time? That is, is this really a max, and do you expect a lot of missing values?

2. Use of valid_range:

  float latitude ;
    latitude:long_name = "Site Latitude" ;
    latitude:standard_name = "latitude" ;
    latitude:units = "degrees_north" ;
    latitude:_CoordinateAxisType = "Lat" ;
    latitude:axis = "Y" ;
    latitude:valid_range = -90.f, 90.f ;

A minor point, but presumably latitude will always have a valid value. IMO, valid_range should be used only when there is a possibility that the value might be out of range and should be considered missing. It matters (not really here, but for larger data variables) because software has to spend time checking whether values are out of range. For example:

  float w_classicL_conf(time, height) ;
    w_classicL_conf:long_name = "W wind (classic/linear) confidence" ;
    w_classicL_conf:valid_range = 0.f, 1.f ;
    w_classicL_conf:_FillValue = -9999.f ;
    w_classicL_conf:coordinates = "height longitude latitude" ;

The _FillValue is probably sufficient. If you want the software to also check whether values are within the valid range, then leave valid_range in; otherwise take it out for efficiency. I assume you just want to document the range, so I would suggest using a different attribute name.
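To make this concrete, here is a minimal CDL sketch of the confidence variable with valid_range dropped in favor of a purely documentary attribute; the attribute name 'expected_range' is illustrative only, not an established convention:

```cdl
float w_classicL_conf(time, height) ;
  w_classicL_conf:long_name = "W wind (classic/linear) confidence" ;
  w_classicL_conf:_FillValue = -9999.f ;    // missing values flagged by _FillValue alone
  w_classicL_conf:coordinates = "height longitude latitude" ;
  // documentation only; software will not range-check this attribute
  w_classicL_conf:expected_range = 0.f, 1.f ;
```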
Gary Granger wrote:
Hi Cory, Don, and Bill:

Don, here's some background info: Cory is working on generating netcdf files from NIMA for many of the kinds of profiler data we have. So I'm thinking this is a good time to straighten out our netcdf conventions as best we can. Cory has much of the CDL already specified, and below I suggest some changes and ask some questions. We've already incorporated the changes you suggested to me back at T-REX. So I invite your feedback so that we can make our data as compatible as possible with Unidata tools like IDV and with existing conventions like CF.

The CDL files most relevant to Unidata are 'consensus.cdl' and 'rass.cdl', since they contain the derived measurements for winds and virtual temperature. However, it might also be useful if we can display some of the intermediate data in IDV, so we'll try to make the conventions among the files as consistent as possible. I've cc'd Ethan and John since you included them in the T-REX email. The more eyes the better. I've attached that email for reference.

I've attached the CDL files from Cory with many of the changes I suggest below already made. I'm hoping we can come to a quick consensus on the final changes we need to make so that Cory can finish her implementation. Then I will (eventually) also be fixing all of our other profiler software to start following the new conventions.

I've been using the CF conventions as my main reference, so I've included links to the relevant parts of that document. I've also compiled all the CDL files into netcdf and passed them through the CF-checker here: http://titania.badc.rl.ac.uk/cgi-bin/cf-checker.pl This gave some useful suggestions which motivate many of my suggestions here. The most common warning is not recognizing units of 'dB' and 'meters^(2/3)/second', but there's nothing we can do about that because those are due to limitations in udunits. [It also warns about MHz, but that's bogus because udunits does recognize those units.]
Don, do you have any suggestions for units like dB and meters^(2/3)/second (for eddy dissipation rate)?

I had thought to change our 'heights(time,height)' variable to 'height(time,height)' so that height would look like a coordinate variable. However, CF seems to discourage this because multi-dimensional coordinate variables could break COARDS-compliant and NUG-compliant applications. Should we keep 'heights' or use 'height'? I've changed the cdl's to use 'height', but if everyone thinks 'heights' is better we can keep that. Either way we can add the 'coordinates' attribute to variables and set it to 'height[s] longitude latitude'. http://tinyurl.com/326u67
It doesn't matter to IDV/CDM, but there's no reason not to follow the CF advice to not use the dimension name for multidimensional coordinates. So 'heights' is better. A bigger problem is that heights _is_ a vertical coordinate, as is altitude. How to indicate this?
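A minimal CDL sketch of the layout being endorsed here, with the multidimensional coordinate named 'heights' (not 'height') and attached to data variables via the 'coordinates' attribute; dimension sizes are illustrative:

```cdl
dimensions:
  time = UNLIMITED ;
  height = 50 ;
variables:
  double time(time) ;
  float heights(time, height) ;    // auxiliary coordinate, not named after a dimension
    heights:long_name = "Height above ground" ;
    heights:units = "meters" ;
  float wspd(time, height) ;
    wspd:coordinates = "heights longitude latitude" ;
```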
What units should the unitless confidence variables have? Should we explicitly specify '1' just for clarity? CF-COARDS does not require it, since lack of 'units' implies unitless, but I'd suggest using '1' for completeness.
For unitless quantities like confidence, it is better to use an empty string, units = "", although "1" is OK also. Definitely don't omit the units attribute.
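For example, either of these marks a confidence variable as dimensionless while keeping the units attribute present:

```cdl
float w_classicL_conf(time, height) ;
  w_classicL_conf:units = "1" ;    // or: w_classicL_conf:units = "" ;
```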
I changed wdir in consensus.cdl to use units of 'degrees'.

Bill, should the consensus files include radar parameters like ncoh and nspec and all the other parameters in spc.cdl? That is, is that kind of configuration info useful enough to be carried forward into the data files that we will likely release to PIs?

I changed the time long_name to "Time". I'd prefer to store the time values as offsets from the first time in the file, and change units to "seconds since <first time in file>". This makes it easy to ncdump a file and see the ASCII representation of the first time in the file, and the time values are humanly interpretable, but any software following the udunits convention will be able to parse the units and compute the times. I can see the value in not having to parse the units string in things like IDL scripts, but really my preference would be to make IDL accommodate the best file conventions possible rather than the other way around. As a compromise we can add an "optional" attribute to the time variable, like 'units_string_in_unix_seconds', and store the unix seconds there.
Another option is to add another variable with the ISO date/time string. I'm planning on making a change request to CF to allow this as a valid time coordinate.
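A sketch combining both ideas for the time coordinate; the base date in the units string is illustrative, and the ISO companion variable is the proposed (not yet CF-sanctioned) addition:

```cdl
dimensions:
  time = UNLIMITED ;
  iso_len = 20 ;
variables:
  double time(time) ;
    time:long_name = "Time" ;
    time:units = "seconds since 2006-01-27 00:00:00 0:00" ;  // first time in the file
  char time_iso(time, iso_len) ;   // e.g. "2006-01-27T00:00:00"
    time_iso:long_name = "ISO date/time string" ;
```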
For sample_start_time and sample_end_time, I presume these are the periods of time over which the consensus is computed. It turns out that CF has a convention for this, which is to use a second dimension of length two to store the beginning and end of the coordinate interval. For example, define a single variable sample_times(time, 2), and then store the sample start time in sample_times[time, 0] and the end time in sample_times[time, 1]. Then there is a 'bounds' attribute to 'time' which names sample_times as the variable holding the interval boundaries. So for the sake of following existing practice, I suggest replacing sample_start_time and sample_end_time with sample_times. That means sample_times should not have a _FillValue attribute, because it should never be empty. The same should go for 'time', even though the cf-checker didn't complain about that one. consensus.cdl has an example of this change; rass.cdl and wind.cdl would need the same change.
yes, bounds is good. we actually are parsing it, and the info is available in the CoordinateAxis object.
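A sketch of the bounds arrangement described above, with an illustrative base time; sample_times(time, nv) holds the start and end of each consensus interval:

```cdl
dimensions:
  time = UNLIMITED ;
  nv = 2 ;
variables:
  double time(time) ;
    time:units = "seconds since 2006-01-27 00:00:00 0:00" ;
    time:bounds = "sample_times" ;
  double sample_times(time, nv) ;  // [t,0] = sample start, [t,1] = sample end
    sample_times:units = "seconds since 2006-01-27 00:00:00 0:00" ;
```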
http://tinyurl.com/38htef

Global 'author' attribute: I remember, Cory, that you'd allow using an environment variable to set the value. In case that variable is not set, I think the author attribute should be left off instead of having a default, unless the default is empty. I'm a little concerned about unexpected values being set for author because the user's not aware of what's already set in the environment, but I guess I can live with it. As an alternative, or in addition to 'author', the CF convention also mentions 'institution'. Maybe that one can be really general but still helpful, like 'NCAR'.
For global attributes you might want to have a look at: http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html
I added standard_name to a few wind variables: wspd, wvert, wdir.

I think we should remove the Data_start_time, Data_end_time, and 'date' global attributes. They are redundant and not completely intuitive; for example, what if we store profiler data across a date boundary? And is the date in local or GMT? Speaking of which, maybe it's worthwhile to store the local timezone in a global attribute, if available. Or maybe we should just use longitude to do things like plot times relative to the local solar noon. Bill?

spectraDbs:_FillValue needs to be outside valid_range, so I changed it to -99999.
As noted before, I don't really like valid_range, but I know it's in wide use.
All the 'height' variables should have a standard_name attribute of 'height'. However, I have a question about the interpretation of height. The 'height' standard_name implies height above the surface, i.e., ground, which should be good enough for us. (Does anyone ever include the height of the antenna above the ground in the calculation of gate heights?) And long_name should be more precise if possible, such as 'Height of center of gate' or 'Height above ground to bottom of gate' or whatever it is; I'm not actually sure myself.

Of course, 'height' should only apply to true vertical variables, like at least the derived winds and virtual temperatures. For radial measurements (moments, snr, doppler, ...), do we usually store the "height" as the distance along the beam, or is it stored as the actual height above the ground? If the latter, then I guess we can keep the height convention for those variables too; otherwise we need something different. I just wasn't sure. Bill or Cory, can you clarify this for me?

Should we store an alternate coordinate variable for the gate altitudes in meters above MSL? Or assume that software can be smart enough to add height to altitude? For example, we could add a 'gate_alt' variable with the gate altitudes pre-computed:

variables:
  float wspd(time, height);
    wspd:coordinates = "lat lon gate_alt time";
  float gate_alt(time, height);
    gate_alt:standard_name = "altitude";
    gate_alt:axis = "Z";
    gate_alt:units = "meters";
    gate_alt:long_name = "Altitude to Center of Gate";

I assume this would make it easier for more generic software to integrate profiler data at the correct relative heights, but maybe it's excessive.
this is the issue that needs some more thought
Should our lat/lon/alt variables have a single dimension of size 1, to make them COARDS-compliant coordinate variables? CF allows scalar coordinate variables, and they can be associated with a variable using the 'coordinates' attribute, but using the 1-dimensional option could be more universal. http://tinyurl.com/2lrjkt
My own opinion is that it's a mistake to extrapolate COARDS, which is about grids, to observational data. It works OK as long as you are storing a single profile in the file, but is incorrect when storing multiple profiles in a file. For that case, the correct generalization is a "profile" dimension, and then lat(profile), lon(profile), etc.
So I would advise against it.
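A sketch of the multi-profile generalization described above; dimension sizes are illustrative, and the layout assumes the profiles share a common set of times:

```cdl
dimensions:
  profile = 3 ;        // number of profiles stored in the file
  time = UNLIMITED ;
  height = 50 ;
variables:
  float lat(profile) ;
  float lon(profile) ;
  float alt(profile) ;
  float wspd(profile, time, height) ;
    wspd:coordinates = "lat lon alt" ;
```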
Should we store the boundaries for the gate heights, eg height_bounds(time,height,2), so that it's obvious where the gate is and where the height coordinate falls relative to the gate? Or is there an attribute we can specify to indicate that the height coordinates are always at the center of the gate (assuming that's where they are)? http://tinyurl.com/38htef
We will assume that the coordinate is a midpoint, and edges are half-way in between. Use bounds if that's not the case, or if you need to convince some other piece of software of the correct interpretation.
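If the gates are not centered on the height coordinate, the bounds mechanism would look like this sketch (variable and dimension names illustrative):

```cdl
dimensions:
  time = UNLIMITED ;
  height = 50 ;
  nv = 2 ;
variables:
  float heights(time, height) ;
    heights:units = "meters" ;
    heights:bounds = "height_bounds" ;
  float height_bounds(time, height, nv) ;  // [..,0] = gate bottom, [..,1] = gate top
```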
Should we identify the Conventions as 'CF-1.0' (assuming we can make everything compliant), or is it still better to have a separate convention 'EOL Profiler Convention 1.0'? http://tinyurl.com/2mfobc
If you can make it CF, you can do :Conventions = "EOL Profiler Convention 1.0, CF-1.0" indicating that it satisfies both. What you have then is "EOL Profiler Convention 1.0" extends "CF-1.0" in some sense.
Thank you!
gary

------------------------------------------------------------------------
Subject: Start of a conversation on profiler data
From: Don Murray <address@hidden>
Date: Fri, 27 Jan 2006 15:19:18 -0700
To: Bill Brown <address@hidden>
CC: Gary Granger <address@hidden>, address@hidden, Ethan Davis <address@hidden>

Hi Bill-

As I mentioned earlier this week, we'd like to work with EOL to come up with a convention for the profiler data that you produce so it will be easier to ingest into IDV and other netCDF programs. So far, the two types of vertical profiler files that we have are the EOL MAPR/RASS/DBS formats and the FSL WPDN format. The main difference between the structure of the files is that your formats are one station for multiple times and the FSL are multiple stations at one time. If you know of others, please let us know (and pointers to samples would be good).

I took one of the MAPR files at http://www.atd.ucar.edu/rtf/projects/srp2004/iss/realtime/data/iss-mapr (mapr040320.windsnc_05) and modified it slightly to make it a little easier to read. The modified file is attached. Basically, I changed the time_offset(time) variable to time(time) and fixed some of the units to be udunits compatible. I also added a few global attributes that add some information which would be useful down the road:

  :Conventions = "EOL Profiler Convention 1.0";
  :latitude_coordinate = "lat";
  :longitude_coordinate = "lon";
  :zaxis_coordinate = "alt";
  :time_coordinate = "time";

The Conventions tag would be a way of describing the format, and having a version number would help as it evolves (e.g., corrections, support for new sensors/parameters). The *_coordinate attributes allow one to name variables whatever they want, but define them in canonical terms.
This was taken from the Unidata Observation Dataset Convention: http://www.unidata.ucar.edu/software/netcdf-java/formats/UnidataObsConvention.html (under the "Identifying the Coordinate Variables" section).

For T-REX, I understand you not wanting to make changes that could create problems, but if you could fix the units, I could create a reader for the IDV pretty easily. If you could change time_offset to time, that would be even better. That change would be more in line with how the RASS files look, so maybe that change could be supported for T-REX. I don't need the global attributes for T-REX, but it's something to consider for the future. Another help for the MAPR files would be to put a .nc extension on them, but that's not critical.

I've cc'd John Caron and Ethan Davis, who are the netCDF experts, and Gary Granger, who's working with you on the EOL end. I'll be out of town next week, but I wanted to get this conversation started. If you have any questions about the changes, let me know.

Don

*************************************************************
Don Murray                               UCAR Unidata Program
address@hidden                                P.O. Box 3000
(303) 497-8628                             Boulder, CO 80307
http://www.unidata.ucar.edu/staff/donm
"Time makes everyone interesting, even YOU!"
*************************************************************