Re: 970605: netcdf and ffio on Cray
- Subject: Re: 970605: netcdf and ffio on Cray
- Date: Thu, 19 Jun 1997 15:59:14 -0600
>To: address@hidden
>From: Elizabeth Hayes <address@hidden>
>Subject: netcdf and ffio (fwd)
>Organization: Cray
>Keywords: 199706051836.MAA15954
Hi Elizabeth,
Sorry to have taken so long to respond to your support email, but it
seems to have "slipped through the cracks" during a busy period, and I
just noticed it had not been answered yet.
> I have received assistance from Steve Luzmoor
> on this problem, but wondered if y'all could
> also help, as Steve is on vacation for ten days.
Well, he's back now, so maybe he can provide help faster than we did :-).
One thing that would help us is to know what version of netCDF you are
using. Since we just released a new version, 3.3.1, a couple of days ago,
and it fixes a few Cray problems (though just in the Fortran interface,
as far as I know), it may be a better version from which to start. We
were able to get 3.3.1 to pass the extensive test suite executed when
"make test" is invoked, using a T90 account at GFDL. See the notes in
the INSTALL file about this.
I'll forward this to Glenn Davis, in case he knows more about the
problem ...
> Steve's suggestions were:
> 1) Isolate whether the problem is ffio or not by
> changing setenv NETCDF_FFIOSPEC to cachea or buffa rather
> than eie.sds.
> Result: No netcdf errors, normal completion, but still
> large locked i/o times.
> Conclusion: Problem doesn't occur when files are
> placed in buffa,cache, or cachea, but does occur
> using eie.mem or eie.sds as i/o layers.
>
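If it's easier to switch layers from inside the program than from the
shell, here's a minimal Fortran sketch along the lines of the PUTENV
call quoted further down; it assumes the call runs before the first
netCDF open or create, and the layer name is just the one suggested
above:

      INTEGER PUTENV, IER
C     Select a simpler FFIO layer so eie.sds can be ruled in or out;
C     this must happen before any netCDF open/create call.
      IER = PUTENV('NETCDF_FFIOSPEC=cachea')
C     A nonzero status presumably means the call failed.
      IF (IER .NE. 0) WRITE(*,*) 'PUTENV failed, status = ', IER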
> 2) set ncopts = NC_VERBOSE;
> this has already been set in the m3io layer
>
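For the Fortran side, the same ncopts setting is reached through NCPOPT
in the version-2 jackets; a minimal sketch, assuming netcdf.inc supplies
the NCVERBOS and NCFATAL constants:

      INCLUDE 'netcdf.inc'
C     Report every netCDF error but keep running, so the failing call
C     can be identified; add NCFATAL to abort on the first error.
      CALL NCPOPT(NCVERBOS)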
> 3) use debugger
> Result: Totalview gives different netcdf errors depending
> on where the breakpoints are set. Setting breakpoints
> in ncvarpt in jackets.c somehow interferes with the
> header information being written to the file, and then
> results in an error saying
> >>> WARNING in subroutine OPNFIL3 <<<
> Error opening file CLD_CRO_2D_G1
> EQ256:>> ./CLD_CRO_2D_G1
> netCDF error number 19
>
> I believe the file was created by the totalview executable
> but the header wasn't written out, so the program doesn't
> think it is a netcdf file.
>
> If I run in totalview without setting breakpoints I get
> the error listed below.
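One way to narrow down error 19 outside of Totalview is to open the file
directly and look at the raw return code; a minimal sketch, assuming the
netCDF 2 Fortran interface (NCOPN and NCNOWRIT from netcdf.inc) and the
file name from the warning above:

      PROGRAM CHKOPN
      INCLUDE 'netcdf.inc'
      INTEGER NCID, RCODE
C     Open read-only and report the raw return code, independently of
C     the I/O API wrapper that prints the warning.
      NCID = NCOPN('CLD_CRO_2D_G1', NCNOWRIT, RCODE)
      IF (RCODE .NE. 0) THEN
         WRITE(*,*) 'NCOPN failed, rcode = ', RCODE
      ELSE
         WRITE(*,*) 'opened ok, ncid = ', NCID
         CALL NCCLOS(NCID, RCODE)
      END IF
      END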
>
> Forwarded message:
> > From eah Tue Jun 3 15:45:56 1997
> > Subject: netcdf and ffio
> > To: luzmoor (Steve Luzmoor)
> > Date: Tue, 3 Jun 1997 15:45:57 -0500 (CDT)
> >
> > Hi Steve,
> >
> > Your name appeared in connection with netcdf 2.4 and ffio
> > optimization in the netcdf problem archive.
> >
> > I am working on a problem that was caused when the
> > /tmp disk on a T90 was changed from DD42 to ND40's.
> > The users were experiencing large locked i/o wait
> > time due to small read/write operations.
> >
> > The suggestions listed under the web document
> > http://www.gfdl.gov/~jps/txt/README.IO_Optimization.txt
> > did not work:
> >
> > setenv NETCDF_FFIOSPEC eie.sds.blocks.diag:184
> >
> > INTEGER PUTENV
> > I = PUTENV('NETCDF_FFIOSPEC=eie.sds.blocks.diag:184')
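One thing worth confirming when the PUTENV form is used is that the
variable is actually in the environment the library sees before the
first open; a minimal sketch, assuming the usual Cray Fortran GETENV
extension:

      CHARACTER*64 SPEC
C     SPEC comes back blank if NETCDF_FFIOSPEC is not set in the
C     environment the program actually sees.
      CALL GETENV('NETCDF_FFIOSPEC', SPEC)
      WRITE(*,*) 'NETCDF_FFIOSPEC = ', SPEC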
> >
> > The errors I obtained were always similar to the following:
> > >>> WARNING in subroutine OPNFIL3 <<<
> > Error opening file MET_CRO_3D_G0
> > EQ256:>> /tmp/hayes_sesarm/maqsip/sesarm/input/SMRAQ_KF_mc3_g0
> > netCDF error number -1
> >
> >
> >
> > >>--->> WARNING in subroutine INCONVERT:INTERP3
> > Could not open MET_CRO_3D_G0
> > Date and time 13:00:00 July 7, 1995 (1995188:130000)
> >
> >
> > *** ERROR ABORT in subroutine INCONVERT
> > Could not interpolate DENS from MET_CRO_3D_G0
> > M3ERR: DTBUF 13:00:00 July 7, 1995
> > Date and time 13:00:00 July 7, 1995 (1995188:130000)
> >
> > The files would open, and the dimensions, variables, and attributes
> > were written. But when a file was taken out of define mode and into
> > data mode, and data were written to it, the file seemed to get
> > corrupted.
> >
> > I tried using the latest eag_ffio library, and using the
> > following to specify which files were put on the sds.
> > setenv FF_IO_DEFAULTS "eie.sds.blocks.diag.nolistio:184:-20mw:6:1,event.summary"
> > setenv FF_IO_OPTS "*mk1* (set.oflags_set+=2.skip|=0x200000 |event | eie) *mk3* (set.oflags_set+=2.skip|=0x200000 |event | eie) *ck3* (set.oflags_set+=2.skip|=0x200000 |event | eie) *cm3* (set.oflags_set+=2.skip|=0x200000 |event | eie) *cw2* (set.oflags_set+=2.skip|=0x200000 |event | eie) *cd2* (set.oflags_set+=2.skip|=0x200000 |event | eie)"
> > setenv FF_IO_OPEN_DIAGS 1
> > setenv FF_IO_LOGFILE /flyer/cri/a/hayes/maqsip/sesarm/rel_KF/problem.ffio.part.cc3
> >
> > I was able to restructure the code that calls the netcdf library
> > routines so that some of the files work on the sds using ffio.
> > These routines are part of the EDSS/Models-3 Air Quality Modeling
> > System developed by Carlie Coats at MCNC for the EPA.
> >
> > Files that I have been able to put on the sds using ffio contained a
> > few variables with one unlimited dimension, and other variables with
> > fixed dimensions. I added a call to set nofill mode after defining all
> > the dimensions, attributes, and variables, and before taking the file
> > out of define mode and into data mode. A count variable with one
> > unlimited dimension was then initialized using ncvpt.
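A minimal, self-contained sketch of that ordering, assuming the netCDF 2
Fortran interface (NCCRE, NCDDEF, NCVDEF, NCSFIL, NCENDF, NCVPT, with
constants from netcdf.inc); the file, dimension, and variable names are
only placeholders, and RCODE should really be checked after each call:

      PROGRAM MKFILE
      INCLUDE 'netcdf.inc'
      INTEGER NCID, RECDIM, CNTID, OLDFIL, RCODE
      INTEGER DIMS(1), START(1), CNT(1), NRECS(1)
C     Create a dataset with one unlimited dimension and one integer
C     count variable along it.
      NCID   = NCCRE('example.nc', NCCLOB, RCODE)
      RECDIM = NCDDEF(NCID, 'REC', NCUNLIM, RCODE)
      DIMS(1) = RECDIM
      CNTID  = NCVDEF(NCID, 'COUNT', NCLONG, 1, DIMS, RCODE)
C     Turn off prefilling while still in define mode, so record
C     variables are not written once with fill values and again
C     with data.
      OLDFIL = NCSFIL(NCID, NCNOFILL, RCODE)
C     Leave define mode; the header is committed to the file here.
      CALL NCENDF(NCID, RCODE)
C     Write one record of the count variable so the unlimited
C     dimension gets a length before other record variables follow.
      START(1) = 1
      CNT(1)   = 1
      NRECS(1) = 0
      CALL NCVPT(NCID, CNTID, START, CNT, NRECS, RCODE)
      CALL NCCLOS(NCID, RCODE)
      END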
> >
> > Files made up of only multiple variables with an unlimited dimension
> > are not affected by setting the nofill option. Calls to synchronize
> > the file to disk also don't seem to work.
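If the synchronize call in question is NCSNC from the version-2 Fortran
interface, a minimal sketch of flushing after each record write, with
NCID being the open dataset id, would be:

C     Push the buffered header and any written records out through the
C     underlying I/O layer; RCODE should come back zero.
      CALL NCSNC(NCID, RCODE)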
> >
> > Would you help me try to find the source of the error? I am just learning
> > about netcdf, and ffio.
> >
> > Many Thanks, Liz Hayes
>
>
> Note: For the files that I am able to put on the sds, the modifications
> to the create-file routines using nofill mode significantly shortened
> the locked i/o wait time. I would appreciate any suggestions on how to
> change the create-file routines for files that contain only variables
> with an unlimited dimension.
--Russ
_____________________________________________________________________
Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu