This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
John Caron wrote the following on 8/16/2006 3:39 PM:

> Hi Dan:
>
> dan.swank wrote:
>
>> This will be a challenge for sure.
>> The NARR, for example, will be an aggregation of ~75,000 GRIB files,
>> stored in a basic ./YYYYMM/YYYYMMDD tree. The recursive datasetScan
>> tag added recently helps a ton with this. Some of our datasets have
>> forecast hours, some don't. Doing a one-forecast-hour aggregation
>> across the 00hr files will help tremendously with all of them, however.
>> While it works wonderfully for NetCDF, I cannot see the NcML
>> aggregation working with this set of data, mainly due to the changing
>> reference times.
>
> I think the FMRC will probably solve it. However, a 75,000-file
> aggregation will be a challenge. I'm actually pretty sure we can solve
> it (with enough server memory!), but it does worry me that with a
> single DODS call, someone could make a request that requires opening
> 75,000 files to satisfy. OTOH, if that's the service you want to
> provide, it sure is a lot better doing it on the server!!! Any thoughts?

Throttles... If the dev team could create an element to specify the
maximum size of a request, in either bytes returned or number of files
accessed, that would be great.

> Looking at the NARR data:
> - it looks like you have them divided by day, then all for the same
>   month.
> - it looks like all the time coordinates are either 0 or 3 hour
>   offsets from run time.

The NARR is a reanalysis; it contains variables defined at the
instantaneous initial time, or as a 0 to 3 hour average, total, or other
operation.

> - what's the difference between narr_a and narr_b? Should they be
>   combined or kept separate?

The differences are explained here:
http://nomads.ncdc.noaa.gov/data.php?name=narrdiffs

> - I assume new files are added now and then? How often? Ever deleted?

New NARR comes in from NCEP on an irregular basis, typically once a
month or less often. This archive is set to grow indefinitely; the files
are never deleted.

>> According to NCEP, our NAM & GFS will soon be forced into GRIB2,
>> but the NCDC-NOMADS NWP holdings are currently entirely a GRIB-1
>> archive. Only recently have home-grown NCDC datasets been created in
>> NetCDF.
>>
>> For NAM & GFS, we have about 6 months online, which comes out to
>> about 700 files when stripped to a one-forecast-time (say 00hr)
>> aggregation. But there are 61 forecast times for GFS, and 21 for NAM.
>
> Do you store each hour separately, or are all the forecast hours for a
> run in the same file?

We store them as one file per forecast hour, which contains all
parameters and vertical levels for that forecast hour.

-Dan
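The recursive datasetScan mentioned in the exchange can be expressed in a
TDS catalog roughly as follows. This is only a minimal sketch: the name,
ID, path, and location values are placeholders, not the actual NOMADS
configuration.

<datasetScan name="NARR GRIB-1 files" ID="narr-files"
             path="narr/files" location="/data/narr/">
  <!-- datasetScan walks the location recursively, so the
       ./YYYYMM/YYYYMMDD tree is picked up without listing each day -->
  <filter>
    <!-- expose only the GRIB files themselves -->
    <include wildcard="*.grb"/>
  </filter>
</datasetScan>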
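The FMRC approach John refers to corresponds to an NcML
forecastModelRunCollection aggregation, which adds an outer run dimension
so each run keeps its own reference time rather than being flattened onto
a single time axis. Again a sketch only: the location, suffix, and
dimension name are assumptions, and scanning ~75,000 files would still
run into the memory and throttling concerns raised above.

<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <!-- each GRIB file holds one run/forecast hour; the aggregation adds a
       "run" dimension on top of the forecast-time coordinate, so the
       changing reference times are represented explicitly -->
  <aggregation dimName="run" type="forecastModelRunCollection">
    <!-- recursive scan of the YYYYMM/YYYYMMDD tree; location and suffix
         are placeholders for the actual archive layout -->
    <scan location="/data/narr/" suffix=".grb" subdirs="true"/>
  </aggregation>
</netcdf>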