This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Justin, >Date: Fri, 16 Sep 2005 09:48:18 -0400 >From: Justin Cooke <address@hidden> >Organization: NOAA >To: address@hidden >Subject: Re: "pbuf_flush: time elapsed" problem The above message contained the following: > I added the logging to the script that we are seeing the most > "pbuf_flush Timed out" from. Yesterday we also broke apart the pqact > processing, previously we only had one pqact for all incoming data. We > created a second entry in ldmd.conf for just CRAFT data and the existing > entry was modified to act on all data type except CRAFT but we still saw > the same errors overnight. Below is another log entry: > > --- > Sep 16 03:02:21 b2n1 eldm4[585824]: 7504142 20050916030208.060 FSL5 > 000 FSL.CompressedNetCDF.MADIS.mesonet4.20050916_0100.gz > Sep 16 03:02:21 b2n1 pqact[909556]: 7504142 20050916030208.060 FSL5 > 000 FSL.CompressedNetCDF.MADIS.mesonet4.20050916_0100.gz > Sep 16 03:02:21 b2n1 pqact[909556]: pipe: -close > /usr/ldm/ncep_bin/run_dcmeso.sh > /dcomdev/us007003/ldmdata/obs/surface/mesonet 20050916_0100.gz > Sep 16 03:04:21 b2n1 pqact[909556]: pbuf_flush 2: time elapsed 120.000055 > Sep 16 03:04:21 b2n1 pqact[909556]: pbuf_flush (2) Timed out > Sep 16 03:04:21 b2n1 pqact[909556]: pipe_put: > -close/usr/ldm/ncep_bin/run_dcmeso.sh/dcomdev/us007003/ldmdata/obs/surface/mesonet20050916_0100.gz > > write error > --- > > But there is nothing in the dcmeso.log file related to this time: > > --- > END ---- PID= 1151078 ---- Thu Sep 15 23:02:08 EDT 2005 > ---------------------------- > exit 0 > + exit 0 > echo "START -- PID=" $$ " ----" `date` "----------------------------" > + date > + echo START -- PID= 684064 ---- Thu Sep 15 23:04:21 EDT 2005 > ---------------------------- > START -- PID= 684064 ---- Thu Sep 15 23:04:21 EDT 2005 > ---------------------------- > > > > > gzip -d > ${1}/$$.${2} > + gzip -d > + 1> /dcomdev/us007003/ldmdata/obs/surface/mesonet/684064.20050916_0200.gz > madisfilename=${1}/`echo ${2} | cut -c1-13` > ---- > > The END time of the previous run_dcmeso.sh instance is 03:02:08Z and the > next doesn't start until 03:04:21Z which is right after the "Timed out" > error but the file being acted on is 20050916_0200.gz where the "Timed > out" was for 20050916_0100.gz. The "START" line is never added when the > the pipe: -close for 03:02:21 is run, it's as if the log is never > touched, looking like the script never ran. > > any ideas? Not yet. I'm going to ponder this today. Incidentally, you might want to invoke the newlog(1) utility via crontab(1) to keep the logfile "dcmeso.log" from growing indefinitely. > We are running this version of LDM on AIX 5.2, is there any other system > information that you would find useful? I'll let you know. > I have attached run_dcmeso.sh, it is slightly different from the script > that I was talking about yesterday but it is doing the same major steps, > unzipping a file and sending it off to a decoder. > > Thanks for the help, > > Justin [snip] > #!/bin/sh -vx > # > # This script is EXECed directly by DBNet in order to run the > # dcmeso decoder on the data file given in the first argument. > # > # Usage: ./run_dcmeso.sh <mesonet_filename> > # > # Once this is done, the data file itself is then compressed > # within its native directory for more efficient short-term > # storage. > # > > # The gzip line must be the first, noncomment line in this script > # so that stdin is processed correctly > exec >> /usr/ldm/logs/dcmeso.log 2>&1 > echo "START -- PID=" $$ " ----" `date` "----------------------------" > > gzip -d > ${1}/$$.${2} > madisfilename=${1}/`echo ${2} | cut -c1-13` > hhmm=`date -u +%H%M` > decoderfilename=${madisfilename}.${hhmm} > mv ${1}/$$.${2} ${decoderfilename} > > . /ioddev/dbndev/.profile > > #export MADIS_STATIC=/nwprod/lib/sorc/madis/static > export MADIS_STATIC=$DCDROOT/lib/sorc/madis-2.5/static > export MADIS_DATA=/dcomdev/us007003/ldmdata > > ln -sf ${decoderfilename} ${madisfilename} > > nice $DCDROOT/bin/decod_dcmeso -v 2 -d > /dcomdev/us007003/decoder_logs/decod_dcmeso.log ${decoderfilename} > /dcomdev/us007003/bufrtab.255 > > rm -f ${madisfilename} > > # > # Compress the decoder input file within its native directory, > # in order to conserve disk space for these large files! > # > > gzip ${decoderfilename} > > # > # Explicitly set the script return code to 0, in order to prevent > # the "compress" return code from becoming the script return code > # (and thereby prevent DBNet from re-running the script for this > # particular data file if there is a problem with the compress!) > # > > echo "END ---- PID=" $$ " ----" `date` "----------------------------" > exit 0 Regards, Steve Emmerson