20050307: LDM status on papagayo
- Date: Mon, 07 Mar 2005 01:17:43 -0700
>From: Unidata User Support <address@hidden>
>Organization: Unidata Program Center/UCAR
>Keywords: 200503070449.j274n2v2001291 IDD LDM pqcat
Hi Clint,
After getting the attached email from Pete Pokrandt tonight, I logged
onto papagayo and found that the LDM was not running. Since the queue
was corrupted by an apparent reboot yesterday, I deleted and remade it,
and then restarted the LDM. Please see below for details.
The LDM did not come up after the reboot yesterday because the queue
check action:
pqcat -s -l /dev/null
was and still is hung, and is chewing up CPU cycles:
load averages: 4.20, 4.27, 3.04 02:10:05
131 processes: 127 sleeping, 1 zombie, 3 on cpu
CPU states: % idle, % user, % kernel, % iowait, % swap
Memory: 4096M real, 2651M free, 483M swap in use, 4743M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
936 root 1 10 0 0K 0K cpu/0 33.4H 21.63% pqcat
...
I am unable to kill it since it is being run by 'root'. Please
kill it as soon as you get a chance.
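For reference, here is a rough sketch of one way to pick out any pqcat
processes and their owners before killing them. The helper name
pqcat_pids is made up for illustration, and it assumes input lines of
the form "PID USER COMMAND" with no spaces in the command name:

```shell
#!/bin/sh
# Hypothetical helper: read "PID USER COMMAND" lines on stdin and print
# "PID USER" for every process whose command basename is exactly pqcat.
pqcat_pids() {
    awk '{
        n = split($3, parts, "/")      # strip any leading directory path
        if (parts[n] == "pqcat") print $1, $2
    }'
}

# Typical use (the kill must be run as root for a root-owned process):
#   ps -e -o pid= -o user= -o comm= | pqcat_pids
#   kill <pid printed above>
```

If a plain kill does not remove the process, root may need to follow
up with kill -9.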
I noticed that the LDM queue is too small (400M) to hold even a small
fraction of an hour's worth of data given the volume that papagayo is
ingesting. Since you have enough space in /data, I took the liberty of
increasing the queue size to 2GB when I restarted the LDM:
<as 'ldm'>
cd ~ldm/etc
-- edit ldmadmin-pl.conf and change $pq_size from "400M" to "2G"
cd ~ldm
ldmadmin delqueue
ldmadmin mkqueue -f
ldmadmin start
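If you want to script the edit rather than do it by hand, something
along these lines should work. This is only a sketch: the helper name
set_pq_size is made up, and it assumes the setting appears in
ldmadmin-pl.conf on a line of the form $pq_size = "400M"; so check
your copy of the file first:

```shell
#!/bin/sh
# Hypothetical helper: set_pq_size <conf-file> <new-size>
# Assumes the file holds a line like:  $pq_size = "400M";
set_pq_size() {
    conf=$1
    new_size=$2
    tmp="$conf.tmp.$$"
    # Rewrite only the quoted value on the $pq_size line; all other
    # lines pass through unchanged.
    sed 's/^\([[:space:]]*\$pq_size[[:space:]]*=[[:space:]]*"\)[^"]*"/\1'"$new_size"'"/' \
        "$conf" > "$tmp" && mv "$tmp" "$conf"
}

# Typical use as 'ldm', with the LDM stopped:
#   set_pq_size ~ldm/etc/ldmadmin-pl.conf 2G
#   ldmadmin delqueue
#   ldmadmin mkqueue -f
#   ldmadmin start
```

The queue has to be deleted and remade after the change because the
new size only takes effect when the queue is created.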
Here is the message Pete sent earlier tonight:
From address@hidden Sun Mar 6 21:49:02 2005
All,
We feed NIMAGE data from papagayo.unl.edu (I don't have the
contact for them available..) We haven't seen any NIMAGE data
since about 22 UTC Friday March 5. I just noticed now, and
flipped over to feed from idd.unidata.ucar.edu (to f5.aos.wisc.edu)
until we figure out what's up.
Pete
Cheers,
Tom