19990701: The CONDUIT data feed
- Subject: 19990701: The CONDUIT data feed
- Date: Thu, 01 Jul 1999 17:19:17 -0600
Celia,
When the data queue is created, the system allocates "full"
pages of memory. Thus the queue size created will be slightly larger
than what is specified in ldmadmin because loading full pages
of memory is more efficient for the operating system.
If the NMC2 data feed is the only thing you are receiving,
you should be able to handle the data feed with an 800MB
queue.
If the queue is growing larger than 800MB, then either pqexpire has
died, or you have changed the invocation of pqexpire so that it is
running less often or keeping data longer than 1 hour.
My settings on flip, which is serving the NMC2 feed:
in ldmadmin: $pq_size = 800000000;
On disk:
[694]chiz on flip --> ls -l ldm.pq
-rw-r--r-- 1 ldm ustaff 828162048 Jul 1 17:00 ldm.pq
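As a rough check of the page-alignment point, the sketch below rounds a requested size up to a whole number of pages and tests whether the ldm.pq sizes reported in this thread (the one on flip and the two Celia quotes below) are page multiples. The 16 KiB page size is an assumption: it is not stated anywhere in this thread, it is simply a size that divides all three files evenly. `round_up_to_page` is a hypothetical helper, not part of the LDM.

```python
PAGE_SIZE = 16 * 1024  # ASSUMED page size in bytes; not stated in this thread


def round_up_to_page(nbytes, page_size=PAGE_SIZE):
    """Round nbytes up to the next whole multiple of page_size."""
    pages = -(-nbytes // page_size)  # ceiling division on integers
    return pages * page_size


requested = 800_000_000  # $pq_size in ldmadmin on flip
# ldm.pq sizes quoted in this thread: flip, then the two iita2 sizes
on_disk_sizes = [828_162_048, 1_125_810_176, 1_228_161_024]

page_aligned_request = round_up_to_page(requested)
all_page_multiples = all(size % PAGE_SIZE == 0 for size in on_disk_sizes)
extra_beyond_rounding = on_disk_sizes[0] - page_aligned_request

print("request rounded up to pages:", page_aligned_request)
print("all files are page multiples:", all_page_multiples)
print("bytes on flip not explained by rounding:", extra_beyond_rounding)
```

Under that assumed page size, all three files come out as whole page multiples, but rounding by itself adds only a few KB to the request. The rest of the difference on flip (roughly 28MB) is presumably space the queue file needs beyond the data region itself; this message does not break that down.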
The high water mark in the queue since June 14 is about 640MB (from kill -USR1):
Jul 01 23:14:02 5Q:flip pqexpire[1145357]: > Queue usage (bytes):641124344
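To keep an eye on this number, a log line like the one above can be parsed and compared with the configured queue size. This is only a minimal sketch: the pattern is inferred from the single log line shown here and may not match other LDM versions, and `queue_usage_bytes` is a hypothetical helper name.

```python
import re

# Log line copied from the pqexpire output above
LOG_LINE = ("Jul 01 23:14:02 5Q:flip pqexpire[1145357]: "
            "> Queue usage (bytes):641124344")
PQ_SIZE = 800_000_000  # $pq_size from ldmadmin on flip


def queue_usage_bytes(line):
    """Return the byte count from a 'Queue usage (bytes):N' line, or None."""
    match = re.search(r"Queue usage \(bytes\):\s*(\d+)", line)
    return int(match.group(1)) if match else None


usage = queue_usage_bytes(LOG_LINE)
fraction = usage / PQ_SIZE
print(f"{usage} bytes is {fraction:.1%} of the configured queue size")
```

With the values from flip, the high water mark works out to about 80% of the configured 800MB, which is why a queue of that size has been enough for the NMC2 feed here.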
In ldmd.conf, pqexpire is being launched with:
exec "pqexpire -i 1200"
Flip is an SGI Octane running IRIX 6.5.4m:
[702]chiz on flip --> uname -aR
IRIX64 flip 6.5 6.5.4m 04151556 IP30
Steve Chiswell
Unidata User Support
>From: address@hidden (Celia Chen)
>Organization: .
>Keywords: 199907012259.QAA28988
>Steve,
>
>I restarted LDM this morning and remade the queue,
>which is set to 1.1GB, and the ldm.pq was 1,125,810,176.
>
>I just reset the pq_size to 1.2GB and the ldm.pq is
>1,228,161,024 now.
>
>NMC2 is the only data iita2 requests. Why does it need
>such a large pq_size and why is iita2's ldm.pq always
>larger than the set pq_size?
>
>Thanks.
>
>Celia
>
>P.S. Yes, I do see "RECLASS" in the ldmd.log files from the last
>few days.
>
>>
>>
>> Celia,
>>
>> The "not enough space" message seems to indicate that the LDM
>> tried to increase the queue size, and failed. In general, you
>> should start the queue as large as you expect to need it.
>> On IRIX, the LDM will try to increase the queue size if needed,
>> but you can run into conflicts with pqexpire. This can corrupt the queue
>> and kill the LDM.
>>
>> If pqexpire dies, then the queue will not be scoured and will
>> keep growing out of control. On my IRIX64 machine, I run a queue
>> sized at 800MB in ldmadmin. My statistics show
>> that the high water mark of this queue is currently about 625MB.
>> The queue has been serving the NMC2 feed for many months without
>> rebuilding.
>>
>> If you have shorter than normal MRF files, then check the LDM
>> logs for RECLASS messages. This would be a sign of latencies
>> greater than 1 hour and losing data. The other common occurrence
>> is for the Cray at NCEP to crash, so that files are late or scrubbed.
>>
>> Steve Chiswell
>> Unidata User Support
>>
>>
>>
>>
>> >From: address@hidden (Celia Chen)
>> >Organization: .
>> >Keywords: 199907011817.MAA21465
>>
>> >I started receiving the MRF grid #3 data on iita2.rap.ucar.edu
>> >on 6/24/99 and feeding WITI on 6/25/99. It looks like
>> >the data was coming in normally for a few days. We have just
>> >noticed that some data files that came in on 6/28 and 6/29 are much
>> >smaller than normal size. Then iita2 stopped saving data on the
>> >disk during 6/29 while WITI was able to continue archiving data
>> >until today. (See below)
>> >
>> >------------------
>> >/iita/data/ldm/mrf
>> >
>> >-rw-rw-r-- 1 ldm ldm 23534364 Jun 27 02:33 99062700132_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23374262 Jun 27 02:36 99062700144_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23473400 Jun 27 02:39 99062700156_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23492248 Jun 27 02:42 99062700168_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23232058 Jun 27 01:38 9906270024_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23332132 Jun 27 01:43 9906270036_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23315028 Jun 27 01:47 9906270048_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23323330 Jun 27 01:57 9906270060_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23264214 Jun 27 02:02 9906270072_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23431280 Jun 27 02:16 9906270084_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23394558 Jun 27 02:22 9906270096_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 19546092 Jun 28 01:50 9906280000_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9342436 Jun 28 02:02 99062800108_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9352474 Jun 28 02:07 99062800120_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23330756 Jun 28 01:54 9906280012_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9368596 Jun 28 02:10 99062800132_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9283910 Jun 28 02:13 99062800144_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9398850 Jun 28 02:14 99062800156_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 11024120 Jun 28 02:58 99062800168_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23236234 Jun 28 01:58 9906280024_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 21756684 Jun 28 02:02 9906280036_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 23275368 Jun 28 02:07 9906280048_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9348664 Jun 28 01:40 9906280060_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9340868 Jun 28 01:45 9906280072_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9365856 Jun 28 01:56 9906280084_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9324932 Jun 28 01:58 9906280096_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 8163116 Jun 29 01:26 9906290000_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9327876 Jun 29 01:29 9906290012_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 15553538 Jun 29 01:53 9906290024_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 17078040 Jun 29 01:55 9906290036_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9364992 Jun 29 01:42 9906290048_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 9364588 Jun 29 01:53 9906290060_PGrbF.mrf
>> >-rw-rw-r-- 1 ldm ldm 5715248 Jun 29 01:55 9906290072_PGrbF.mrf
>> >------------------
>> >
>> >There is this "Not enough space" message in pqact.log:
>> >
>> >------------------
>> >Jun 24 22:35:54 pqact[2235]: Starting Up
>> >Jun 29 07:57:18 pqact[2235]: mmap: 18040000 0 1744732160: Not enough space
>> >Jun 29 07:57:18 pqact[2235]: Remap failed. Abandon all hope.
>> >Jun 29 07:57:18 pqact[2235]: pq_sequence failed: Not enough space (errno = 12)
>> >Jun 29 07:57:18 pqact[2235]: Exiting
>> >------------------
>> >
>> >It looks like there is enough disk space to store the MRF data on
>> >iita2 at this point:
>> >
>> >-----------
>> >iita2|22|% df /iita
>> >Filesystem Type blocks use avail %use Mounted on
>> >/dev/dsk/xlv/xlviita xfs 14163224 11989424 2173800 85 /iita
>> >
>> >-----------
>> >What could be the cause of the problems we see here? Please advise.
>> >
>> >Thanks.
>> >
>> >Celia
>> >
>>
>> ****************************************************************************
>> Unidata User Support UCAR Unidata Program
>> (303)497-8644 P.O. Box 3000
>> address@hidden Boulder, CO 80307
>> ----------------------------------------------------------------------------
>> Unidata WWW Service http://www.unidata.ucar.edu/
>> ****************************************************************************
>>
>