This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
> Organization: NCAR/CGD
> Keywords: 199408280439.AA01442 netCDF Cray
> From: address@hidden (Phil Rasch)

Hi Phil,

> Over the last year or so I have seen 5 or 10 messages within the
> discussion group lamenting the inefficiencies of using the XDR package
> on the Crays to do the IEEE conversions within the netCDF package, and
> a few proposed solutions.  Unfortunately, I have not yet seen anyone
> advertise a real IMPLEMENTED solution.  I am now being bitten
> personally by this problem.  Before I delve into it myself, or put my
> programmer on it, I thought I would ask your advice.  So my questions
> are:
>
> 1) Do you know somebody who has implemented a fast way of writing
> floats on the Cray under netCDF?

On the Cray 3, yes; but we know of no complete implemented solution on
other Crays.  I have appended notes about the Cray 3 solution as part of
a mail exchange with Chris Anderson from NERSC.

> 2) If not, why not?  Is it a really hard problem?  Where are the
> bottlenecks?

I've wondered the same thing.  It seems as though it should be fairly
easy to replace the code that converts an array of floats with one
xdr_float() call per value by a call to a vectorized Cray library
function that converts a whole array of values at once.  Part of the
problem is that you have to violate the natural layering of the XDR
library, since it provides a special-purpose call for an array of bytes
but no such call for an array of floats.  I believe that the
function-call overhead of one call per float is what consumes most of the
CPU time, rather than the conversion to and from IEEE floating-point
representation itself.
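To make the contrast concrete, here is a minimal sketch in terms of the
standard Sun RPC XDR calls.  It is illustrative only, not code from the
netCDF library, and error handling is abbreviated:

    #include <rpc/rpc.h>    /* XDR, xdr_float(), xdr_vector() */

    /* One library call per value: the per-call overhead dominates. */
    static bool_t
    encode_floats_scalar(XDR *xdrs, float *vals, u_int nvals)
    {
        u_int i;
        for (i = 0; i < nvals; i++)
            if (!xdr_float(xdrs, &vals[i]))
                return FALSE;
        return TRUE;
    }

    /* One call for the whole array: an optimized xdr_vector() (or any
     * vectorized conversion routine) can then process the values in bulk. */
    static bool_t
    encode_floats_vector(XDR *xdrs, float *vals, u_int nvals)
    {
        return xdr_vector(xdrs, (char *) vals, nvals, (u_int) sizeof(float),
                          (xdrproc_t) xdr_float);
    }

The second form is the shape of the change discussed in the rest of this
thread: a single xdr_vector() call that a vendor can back with a
vectorized conversion routine.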
> 3) If there aren't any, where do I start (i.e., can you point me to the
> low-level routine I have to modify)?  Hopefully, you could save me a
> day or so of poking around with a sentence or two.

The appended description of what was done on the Cray 3 may be of some
help.

--
Russ Rew                                    UCAR Unidata Program
address@hidden                              P.O. Box 3000
http://www.unidata.ucar.edu/                Boulder, CO 80307-3000

>From address@hidden Mon May 2 09:18:54 1994
Message-Id: <address@hidden>
Full-Name: Russ Rew
To: Chris Anderson <address@hidden>
Subject: Re: NetCDF & XDR on Cray
In-Reply-To: Your message of "Mon, 02 May 1994 11:27:45 PDT." <address@hidden>
Organization: UCAR Unidata Program
Date: Mon, 02 May 1994 15:18:54 -0600
From: Russ Rew <address@hidden>

> Organization: The National Energy Research Supercomputer Center (NERSC)
> Keywords: 199405021827.AA20955

Hi Chris,

> Hello Russ,  I was at the Dept. of Energy Computer Graphics forum last
> week.  I talked with some people from Los Alamos and Sandia laboratories
> regarding XDR performance on the Crays, and it seems that there is more
> than enough interest in our providing some effort to boost the
> performance of netCDF on the Cray.  I thought that I should try to gauge
> your interest before embarking on this journey.
>
> From the minimal contacts I have had with the Cray people, they
> understand that the XDR routines aren't optimal, but don't seem too
> interested in doing anything about it.  On the other hand, with the Cray
> T3D coming along, the issue of going from Cray representation to IEEE
> may become more important to them (as the T3D uses Cray Y-MP and DEC
> Alpha processors).  As both NERSC and NCAR will be getting a T3D,
> perhaps we have a renewed opportunity for getting Cray to do something
> (though I won't hold my breath ;-)
>
> In the meantime, Cray has other routines for doing IEEE format
> conversions.  We at NERSC are willing to try to make use of these (more
> optimized) routines in netCDF, if you feel that the UCAR people will
> consider making their use part of the distribution (we won't need to
> distribute the routines, just make "ifdef'ed" calls to them for the
> CRAY).  Looking forward to hearing from you.
>
> --Chris

We would certainly consider making optimized versions of the XDR routines
for Cray platforms available as part of the distribution.  Several users
have asked us about such an optimization, and Cray Computer has
corresponded with us about their improvements to xdr_array() to get
efficient conversion of floating-point arrays on a Cray-3.  They claim
that their changes for the Cray-3 would not do any good on the Cray
Research (CRI) platforms, so we might have a problem trying to support
both sets of modifications.  I've appended some email correspondence I
had with people from Cray Computer on this issue ...

__________________________________________________________________________
Russ Rew                                    UCAR Unidata Program
address@hidden                              P.O. Box 3000
(303) 497-8645                              Boulder, Colorado 80307-3000


To: address@hidden (Dave Resch), support-netcdf
Subject: Re: netcdf install on Cray3
Organization: UCAR Unidata Program
Date: Mon, 11 Apr 1994 15:13:35 -0600
From: Russ Rew <russ@buddy>

> Organization: Cray
> Keywords: 199404082139.AA08286

Hi Dave,

> The netCDF software has been installed on the Cray3 machine at NCAR
> (graywolf).  Following is a list of modifications that were needed to
> get the software to install and run correctly on the Cray3:
>
> 1) Edit the CUSTOMIZE file in the netcdf-2.3.2 directory:
>
>       CC=/bin/cc
>       FC=/bin/f77
>       FFLAGS=
>       prefix=/usr/local
>       #
>       # Specify the operating system
>       OS=csos
>       #
>       # Ensure that the "native" xdr header files and libraries are used
>       #
>       CPP_XDR="/usr/include/rpc"
>       LD_XDR="-l /usr/lib/libnet.a -l /usr/lib/librpc.a"
>
> 2) Create an operating system file in the netcdf-2.3.2/fortran directory
>    named csos.m4.  This is simply the unicos.m4 file with the following
>    change:
>
>       < define(`M4__SYSTEM',CSOS)
>       ---
>       > define(`M4__SYSTEM', UNICOS)
>
> 3) Edit the configure script to recognize the LD_XDR definition from
>    above by commenting out the line which redefines LD_XDR to be
>    nothing:
>
>       < # *) LD_XDR=;;
>       ---
>       > *) LD_XDR=;;
>
> 4) Edit the file netcdf-2.3.2/libsrc/netcdf.h to correctly define
>    FILL_BYTE for the CCC compilers:
>
>       < #define FILL_BYTE ((signed char)-127)   /* Largest Negative value */
>       ---
>       > #define FILL_BYTE ((char)-127)          /* Largest Negative value */
>
> We (Cray Computer) would like to know if Unidata is willing to add the
> above support to the netCDF install kit so that netCDF correctly
> installs on a Cray3 machine?

Yes, the changes look small enough that I would have no problem adding
them to the source distribution.  I would like to be able to identify the
platform at compile time so I can integrate these changes into our
auto-configure package.  For that, I need to know what predefined
constant is available for specifying this platform.  For example, the
UNICOS Cray C compiler predefines the macro "_UNICOS", so I can test on
it with statements like

    #ifdef _UNICOS
    ...
    #endif

Is there a similar predefined macro for CSOS?
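Assuming such a macro exists, the FILL_BYTE change in item 4 could be
folded into netcdf.h along these lines.  This is only a sketch following
the diff quoted above; "_CSOS" is a hypothetical placeholder for whatever
the CSOS C compiler actually predefines:

    #ifdef _CSOS                            /* hypothetical macro name */
    #define FILL_BYTE ((signed char)-127)   /* form shown above for the CCC compilers */
    #else
    #define FILL_BYTE ((char)-127)          /* Largest Negative value */
    #endif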
> Additionally, we have developed some very efficient, vectorized
> Cray <==> IEEE floating-point conversion routines.  We will be modifying
> our xdr_array() implementation to use these new conversion routines and
> would also like to modify some of the netcdf-2.3.2/libsrc modules to use
> them.
>
> Basically, a single call would be made to convert an arbitrary number of
> values rather than converting a single value at a time from within a
> loop, as is currently done.  The changes should be very minor.  Again,
> we would like to know if Unidata is willing to add this support (via
> conditional compilation directives) to the netCDF sources?

We have had several requests for such an optimization, since currently
this is the main bottleneck in using the netCDF library on Cray
platforms.  I would be willing to add this support, via conditional
compilation directives.  This would be especially useful if it worked for
all Cray platforms, not just the Cray 3.

__________________________________________________________________________
Russ Rew                                    UCAR Unidata Program
address@hidden                              P.O. Box 3000
(303) 497-8645                              Boulder, Colorado 80307-3000


To: address@hidden (Steve Gombosi)
Subject: Re: netcdf install on Cray3
Organization: UCAR Unidata Program
Date: Mon, 11 Apr 1994 17:01:25 -0600
From: Russ Rew <russ@buddy>

Steve,

> >We have had several requests for such an optimization, since currently
> >this is the main bottleneck in using the netCDF library on Cray
> >platforms.  I would be willing to add this support, via conditional
> >compilation directives.  This would be especially useful if it worked
> >for all Cray platforms, not just the Cray 3.
>
> The optimization in question is for array.c to call xdr_vector() once
> rather than issue multiple calls to xdr_float() or whatever.  This
> produces a significant speedup because we are in the process of
> modifying xdr_vector() to use vectorized conversion routines which we
> (Cray Computer) have written for the Cray-3.
>
> The version of xdr_vector() supplied by Cray Research is not optimized
> in this way.  There would be no benefit to making this modification to
> array.c on CRI machines - in fact, it would produce a slight loss of
> performance on those systems (the results should still be correct,
> however).
>
> The speedup in data conversion is quite substantial.  For an XDR stream
> opened to memory (via xdrmem_create()), the conversion is asymptotically
> about 100 times faster than using individual calls to xdr_float().  For
> an XDR stream opened to a file (via xdrstdio_create()), which would be
> the most common case for netCDF, the asymptotic speedup factor appears
> to be in the neighborhood of 200 times.  These timings are based on
> actual runs on the NCAR Cray-3 - the simulator probably would have
> produced more consistent measurements, but the old code takes so long to
> run under the simulator that it's not practical to convert large amounts
> of data.  The following is a measurement of the performance of the
> conversion loop from array.c contrasted with a single, equivalent call
> to the new xdr_vector() routine:
>
>   XDR opened with xdrstdio_create()
>
>       Size    Clocks(old)    Clocks(new)    Speedup(old/new)
>          1          12962           9568            1.354724
>          2           7104           5328            1.333333
>        ...
>     131072      432382624        2265124          190.8870
>     262144      864768712        4518792          191.3717
>     524288     1729798122        9026452          191.6366
>
> Optimized integer conversions are not yet in place but should be
> available sometime this week.
>
> How large are the arrays which are typically written by array.c?

I don't know.  I imagine they are all over the map, but might be expected
to be somewhat larger on Crays than on workstations.  In any case, this
looks like a worthwhile improvement, even for small vectors.

I don't know much about the Cray 3 or about how it compares with CRI
Crays.  Is there an on-line document I could read to learn more about it
or about Cray Computers?  For example, I'm curious whether you're
actually using a different representation for floating-point numbers,
since you evidently wrote a new XDR library.

--Russ
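The change described above, together with the caveat about CRI machines,
could be packaged with the conditional compilation directives both sides
mention.  The following is only a sketch of that idea, not code from the
netCDF distribution; NC_CRAY3_VECTOR_XDR is a hypothetical symbol
standing in for whatever a configure script would actually define on the
Cray-3:

    #include <rpc/rpc.h>

    /* Sketch of the conditional-compilation packaging; not netCDF code. */
    static bool_t
    convert_float_array(XDR *xdrs, float *vals, u_int nvals)
    {
    #ifdef NC_CRAY3_VECTOR_XDR
        /* Cray-3: one call, converted in bulk by the vectorized xdr_vector(). */
        return xdr_vector(xdrs, (char *) vals, nvals, (u_int) sizeof(float),
                          (xdrproc_t) xdr_float);
    #else
        /* CRI and other platforms keep the existing per-element loop, since
         * their stock xdr_vector() would give no benefit (or a slight loss). */
        u_int i;
        for (i = 0; i < nvals; i++)
            if (!xdr_float(xdrs, &vals[i]))
                return FALSE;
        return TRUE;
    #endif
    }

Keeping the per-element loop as the default means CRI and non-Cray builds
behave exactly as before, while a Cray-3 build gets the single bulk call
that the timings above favor.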
To: davis
Subject: [address@hidden (Steve Gombosi): Re: netcdf install on Cray3]
Date: Tue, 12 Apr 1994 08:28:36 -0600
From: Russ Rew <russ@buddy>

Glenn,

Just for your information, this corrects a misstatement I made yesterday
about the floating-point representations being different on Cray 3's and
other Crays.

--Russ

------- Forwarded Message

Date: Mon, 11 Apr 94 17:36:19 MDT
From: address@hidden (Steve Gombosi)
To: address@hidden, address@hidden
Subject: Re: netcdf install on Cray3

>I don't know much about the Cray 3 or about how it compares with CRI
>Crays.  Is there an on-line document I could read to learn more about it
>or about Cray Computers?

There are several man pages on the Cray-3.  If you'd like, I could email
copies of them to you.  We could probably arrange to get you a hardware
manual as well.

>For example, I'm curious whether you're actually using a different
>representation for floating-point numbers, since you evidently wrote a
>new XDR library.

No, the floating-point representation is identical to the CRI machines.
We are using essentially the same library we received from CRI when the
company split up.  While helping Dave in his efforts to get the netCDF
port working, I noticed that there was significant room for improvement
in the code and decided to see what some small changes would yield.

Unfortunately, the entire design of the XDR library is oriented toward
small, scalar machines.  It seems that no one at CRI bothered to make any
attempt at an efficient implementation for a large, vector architecture.

Steve

------- End of Forwarded Message
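A comparison in the spirit of the timing table quoted earlier can be
reproduced on any system that provides the Sun RPC XDR routines.  The
harness below is only a sketch in ordinary C, timing an in-memory stream
created with xdrmem_create(); the absolute numbers will of course differ
from the Cray-3 measurements:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <rpc/rpc.h>      /* XDR, xdrmem_create(), xdr_float(), xdr_vector() */

    #define NVALS 262144u     /* one of the sizes from the table above */

    int
    main(void)
    {
        float  *vals = malloc(NVALS * sizeof(float));
        char   *buf  = malloc(NVALS * 4);    /* XDR encodes 4 bytes per float */
        XDR     xdrs;
        u_int   i;
        clock_t t0, t1;

        if (vals == NULL || buf == NULL)
            return 1;
        for (i = 0; i < NVALS; i++)
            vals[i] = (float) i * 0.5f;

        /* Old style: one xdr_float() call per value. */
        xdrmem_create(&xdrs, buf, NVALS * 4, XDR_ENCODE);
        t0 = clock();
        for (i = 0; i < NVALS; i++)
            xdr_float(&xdrs, &vals[i]);
        t1 = clock();
        printf("xdr_float loop:  %.3f s\n", (double) (t1 - t0) / CLOCKS_PER_SEC);
        xdr_destroy(&xdrs);

        /* New style: a single xdr_vector() call for the whole array. */
        xdrmem_create(&xdrs, buf, NVALS * 4, XDR_ENCODE);
        t0 = clock();
        xdr_vector(&xdrs, (char *) vals, NVALS, (u_int) sizeof(float),
                   (xdrproc_t) xdr_float);
        t1 = clock();
        printf("xdr_vector call: %.3f s\n", (double) (t1 - t0) / CLOCKS_PER_SEC);
        xdr_destroy(&xdrs);

        free(buf);
        free(vals);
        return 0;
    }

On machines whose xdr_vector() simply loops over xdr_float() internally,
the two times will be close; the Cray-3 gains described above came from
replacing that internal loop with vectorized conversion code.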