- Subject: Unidata Support: 951213: More netCDF-2.4-beta5 test results
- Date: Wed, 13 Dec 1995 10:20:20 -0700
Jeff,
I'm forwarding the note below from John Sheldon at GFDL, which contains more
results from testing Cray optimizations. He's using "nctime.c", which is a
little stand-alone netCDF benchmarking program I wrote a couple of years ago
that is available from
ftp://ftp.unidata.ucar.edu/pub/netcdf/nctime.c
I intend to see if I can reproduce Sheldon's results on shavano, but I
probably won't get to it until tomorrow or Friday.
I have a few questions:
1. In your current position, do you have any time to look at this (and the
NCFILL/NCNOFILL results) and respond to Sheldon's questions? If not,
do you know of anyone else with enough Cray expertise to investigate or
explain the results from Sheldon's tests?
2. It's possible that some of the results Sheldon is seeing are due to the
way we integrated your optimizations into our release. Is there still
a copy of your Cray library around that just has your Cray
optimizations to the previous netCDF 2.3.2 release, without any other
changes we've made for 2.4? If so, I'd like to link against that and
run the tests on shavano with that version too.
3. Are the benchmarks done by nctime too artificial or variable to be
   useful? It takes a four-dimensional slab of a specified size, times
   writing it all out with ncvarput, and then times reading back in all
   16 kinds of cross sections. It does this for all six netCDF data
   types. (A minimal sketch of the kind of call being timed follows
   below.) Previously I've noted that unless a local file system is
   used, NFS caching may get in the way of consistent results. Results
   may also depend strongly on the slab sizes used and may vary from
   run to run for other reasons that are difficult to control.
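For reference, the kind of call nctime times looks roughly like the
sketch below. This is just an illustration against the netCDF-2 C
interface, not code from nctime.c itself; the file and variable names
are made up.

    #include <stdio.h>
    #include <time.h>
    #include "netcdf.h"   /* netCDF-2 interface: ncopen, ncvarid, ncvarget */

    int
    main()
    {
        /* "test.nc" and "float_var" are placeholder names, not nctime's */
        long start[4] = {0, 0, 0, 0};       /* corner of the hyperslab */
        long count[4] = {12, 18, 37, 73};   /* edge lengths: whole variable */
        static float buf[12*18*37*73];
        int ncid, varid;
        clock_t t0, t1;

        ncid = ncopen("test.nc", NC_NOWRITE);
        varid = ncvarid(ncid, "float_var");

        t0 = clock();
        ncvarget(ncid, varid, start, count, (void *) buf);
        t1 = clock();

        printf("time for ncvarget %ldx%ldx%ldx%ld %10.3f msec\n",
               count[0], count[1], count[2], count[3],
               1000.0 * (double) (t1 - t0) / CLOCKS_PER_SEC);
        ncclose(ncid);
        return 0;
    }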
Thanks for any light you can shed on this!
--Russ
------- Forwarded Message
From: address@hidden (John Sheldon)
Organization: GFDL
Keywords: 199512121934.AA29804 netCDF CRAY
Hi again-
In my continuing timing tests on our C90, I noticed a potentially
serious problem with version 2.4. I used "nctime" with both the 2.3.2
and 2.4-beta5 libraries and got the following:
2.3.2 :
----- float_var(12,18,37,73)
time for ncvarput 12x18x37x73 2026.276 msec
time for ncvarget 1x1x1x1 0.048 msec
time for ncvarget 12x1x1x1 5.874 msec
time for ncvarget 1x18x1x1 1.303 msec
time for ncvarget 1x1x37x1 1.579 msec
time for ncvarget 1x1x1x73 0.400 msec
time for ncvarget 12x18x1x1 14.185 msec
time for ncvarget 12x1x37x1 12.739 msec
time for ncvarget 12x1x1x73 9.585 msec
time for ncvarget 1x18x37x1 11.279 msec* <----
time for ncvarget 1x18x1x73 7.430 msec
time for ncvarget 1x1x37x73 12.867 msec*
time for ncvarget 12x18x37x1 102.484 msec
time for ncvarget 12x18x1x73 62.653 msec
time for ncvarget 12x1x37x73 115.803 msec
time for ncvarget 1x18x37x73 162.005 msec*
time for ncvarget 12x18x37x73 1939.247 msec
2.4-beta5 :
----- float_var(12,18,37,73)
time for ncvarput 12x18x37x73 15.825 msec
time for ncvarget 1x1x1x1 2.729 msec
time for ncvarget 12x1x1x1 22.667 msec
time for ncvarget 1x18x1x1 32.672 msec
time for ncvarget 1x1x37x1 54.518 msec
time for ncvarget 1x1x1x73 2.177 msec
time for ncvarget 12x18x1x1 342.961 msec
time for ncvarget 12x1x37x1 701.648 msec
time for ncvarget 12x1x1x73 22.740 msec
time for ncvarget 1x18x37x1 1011.911 msec* <---- !!! x92 more ! PROBLEM!!
time for ncvarget 1x18x1x73 32.716 msec
time for ncvarget 1x1x37x73 2.257 msec* <---- x6 less
time for ncvarget 12x18x37x1 12542.287 msec
time for ncvarget 12x18x1x73 341.818 msec
time for ncvarget 12x1x37x73 22.605 msec
time for ncvarget 1x18x37x73 3.594 msec* <---- x50 less
time for ncvarget 12x18x37x73 38.172 msec
While many of the accesses posted much better times, the X-Z slab
accesses were some 90 times slower! If I wanted to step through X-Z
slabs of the results from a 1-degree model run (180 y-points), it
would take 3 minutes where it used to take 2 seconds. (A sketch of
that access pattern follows below.)
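Here's roughly what I mean; this is a sketch, not my actual code, and
it assumes the variable's dimension order is (time, z, x, y), with
ncid and varid standing in for a file and variable opened elsewhere:

    #include "netcdf.h"   /* netCDF-2 interface: ncvarget */

    /* Sketch only: read a run's output one X-Z slab at a time.
     * Under 2.4-beta5, 180 ncvarget calls at ~1012 msec each is
     * about 3 minutes; under 2.3.2, at ~11 msec each, about 2
     * seconds. */
    void
    read_xz_slabs(int ncid, int varid)
    {
        static float slab[12*18*37];        /* one 12x18x37x1 slab */
        long start[4] = {0, 0, 0, 0};
        long count[4] = {12, 18, 37, 1};
        long y;

        for (y = 0; y < 180; y++) {         /* 180 y-points */
            start[3] = y;
            ncvarget(ncid, varid, start, count, (void *) slab);
        }
    }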
Now, that's on the "read" end. On the "write" end, I wrote my own
small test program (with NCNOFILL!) which showed that the times are
substantially _less_ using version 2.4: about 1.2 user-CP seconds
vs. 21 for version 2.3.2, and a wall-clock time of 1:30 vs. 13-26(!)
minutes (I did a couple of runs) for version 2.3.2. I don't
understand why writing goes so much faster with version 2.4 but
reading goes so much slower. Any ideas? (The skeleton of my write
test is sketched below.)
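The skeleton of the write test is roughly the following; this is a
sketch with made-up names, not the actual program. The only unusual
call is ncsetfill, made before ncendef so the variable is never
prefilled:

    #include "netcdf.h"   /* netCDF-2: nccreate, ncsetfill, ncvarput */

    int
    main()
    {
        /* Placeholder names throughout; not the actual test program. */
        long start[4] = {0, 0, 0, 0};
        long count[4] = {12, 18, 37, 73};
        static float buf[12*18*37*73];      /* data to write (zeros here) */
        int dims[4];
        int ncid, varid;

        ncid = nccreate("fill_test.nc", NC_CLOBBER);
        dims[0] = ncdimdef(ncid, "time", 12L);
        dims[1] = ncdimdef(ncid, "z",    18L);
        dims[2] = ncdimdef(ncid, "x",    37L);
        dims[3] = ncdimdef(ncid, "y",    73L);
        varid = ncvardef(ncid, "float_var", NC_FLOAT, 4, dims);

        ncsetfill(ncid, NC_NOFILL);   /* don't prefill with fill values */
        ncendef(ncid);                /* leave define mode; no prefill  */

        ncvarput(ncid, varid, start, count, (void *) buf);
        ncclose(ncid);
        return 0;
    }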
Hope this helps, even if it's not necessarily welcome news...
John
address@hidden
------- End of Forwarded Message