This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
> To: address@hidden
> From: address@hidden (John Sheldon)
> Subject: netCDF 2.4.3 prefill on Cray
> Organization: Princeton/GFDL
> Keywords: 199611160003.AA04557

> I finally got around to testing out 2.4.3 on our Cray IEEE T90.
> Unfortunately, I did not see any improvement in the speed of pre-fill.

(Joking tone of voice) As far as I know, performance improvement for prefill on Cray T90 wasn't in the specs for netcdf-2.4.3, so I don't know why anyone would be expecting it. As I mention at the end, those sorts of improvements are in netcdf-3.

Without doing any profiling, I would guess that the 2.4.3 performance difference between fill and no_fill is due to the fact that the "source" array for fill data is very small, and the prefill loop isn't as smart as the varput loop. The prefill loop is something like:

    for(ii = current_number_of_records;
            ii < current_number_of_records + new_records; ii++)
    {
        NCfillrecord(..., ii);
    }

The function fill_record(..., recnum) steps through the "record" variables (in your example case, just the one) and calls xdr_NC_fill() to fill the space for that particular variable. The function xdr_NC_fill() just loops through a single record's data, converting from a small (in your case 2 value) source array. In contrast, for a varput(), you are handing it a source array with all the values, so the record fill is completely parallel?

All is not lost, however. It looks to me like someone ("SWANSON") has done some work on improving this for other Cray architectures; see cdf.c: xdr_NC_fill(). It looks like he was being conservative in his changes here. I believe if you change line 426 of cdf.c:

    #if !defined(_CRAY) || defined(_CRAYMPP) || defined(_CRAYIEEE)

to

    #if !defined(_CRAY)

your architecture would benefit from this work. (In your case, the call to xdr_floats() would end up being a copy rather than a conversion.) Give that a try and see if things improve. Note that it is really a bit of a hack; it only fixes 'float' values.
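To make the guess above concrete, here is a minimal sketch in C of the two fill strategies: converting every value from a tiny source array, versus converting one record image once and block-copying it. This is not the netCDF source. The names convert_value, fill_per_value and fill_per_block are hypothetical, and the conversion step is an identity stand-in for whatever encoding a real library would do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-value conversion step (identity here); a real library
   would encode to its external representation at this point. */
static float convert_value(float v) { return v; }

/* Record-at-a-time fill from a tiny source array: one conversion per value.
   srclen must be nonzero. Returns the number of conversions performed. */
size_t fill_per_value(float *dst, size_t nrecs, size_t reclen,
                      const float *src, size_t srclen)
{
    size_t conversions = 0;
    for (size_t rec = 0; rec < nrecs; rec++) {
        float *out = dst + rec * reclen;
        for (size_t i = 0; i < reclen; i++) {
            out[i] = convert_value(src[i % srclen]);
            conversions++;
        }
    }
    return conversions;
}

/* Alternative: convert one record image once, then block-copy it into
   every record. 'image' is caller-provided scratch space of reclen floats.
   Returns the number of conversions performed. */
size_t fill_per_block(float *dst, size_t nrecs, size_t reclen,
                      float fill_value, float *image)
{
    size_t conversions = 0;
    for (size_t i = 0; i < reclen; i++) {
        image[i] = convert_value(fill_value);
        conversions++;
    }
    for (size_t rec = 0; rec < nrecs; rec++)
        memcpy(dst + rec * reclen, image, reclen * sizeof(float));
    return conversions;
}
```

Both functions leave the same data in dst. The first performs nrecs * reclen conversions, the second only reclen, with the rest of the work done as plain copies.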
In netcdf-3, all architectures and all types benefit from this sort of thing. There is a compile-time tuning parameter, NFILL (in putget.c), which controls the space vs. time tradeoff of fill buffer size vs. looping for fill. NC_PG_CHUNK/sizeof(double). NC_PG_CHUNK is another compile-time tuning parameter, defined in nc.h, which controls the same sort of tradeoff for all i/o. On systems with lots o' memory, NC_PG_CHUNK could be increased well beyond its default value of 16384.

Hope this helps.

-glenn
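The space vs. time tradeoff mentioned above can be illustrated with a small C sketch. It is not taken from putget.c: FILL_BUF_LEN is a hypothetical constant standing in for the role the message gives NFILL, its value of 16 is an arbitrary choice for illustration, and fill_passes and fill_chunked are made-up names. A bigger buffer costs more memory but needs fewer passes through the loop.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tuning constant: how many values the fill buffer holds.
   Override at compile time with -DFILL_BUF_LEN=... */
#ifndef FILL_BUF_LEN
#define FILL_BUF_LEN 16
#endif

/* Number of buffer-sized passes needed to fill nvalues values
   (ceiling division; buf_len must be nonzero). */
size_t fill_passes(size_t nvalues, size_t buf_len)
{
    return (nvalues + buf_len - 1) / buf_len;
}

/* Fill dst with fill_value one buffer-load at a time.
   Returns the number of passes actually used. */
size_t fill_chunked(double *dst, size_t nvalues, double fill_value)
{
    double buf[FILL_BUF_LEN];
    size_t passes = 0;

    /* Prepare the fill buffer once. */
    for (size_t i = 0; i < FILL_BUF_LEN; i++)
        buf[i] = fill_value;

    /* Copy it out repeatedly; the last pass may be partial. */
    for (size_t done = 0; done < nvalues; passes++) {
        size_t left = nvalues - done;
        size_t n = left < FILL_BUF_LEN ? left : FILL_BUF_LEN;
        memcpy(dst + done, buf, n * sizeof(double));
        done += n;
    }
    return passes;
}
```

With the buffer length of 16 assumed here, filling 100 values takes 7 passes; a buffer of 2048 values would cover the same request in one pass at the cost of a larger buffer.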