20050419: some modifications on bigbird
- Subject: 20050419: some modifications on bigbird
- Date: Tue, 19 Apr 2005 17:21:10 -0600
>From: Unidata User Support <address@hidden>
>Organization: Unidata Program Center/UCAR
>Keywords: IDD ingest decode relay
Hi Gerry,
After seeing the continuously high load averages on bigbird this morning,
I decided to see if I could help mitigate and better monitor the situation:
monitoring:
- added my id_dsa.pub key to the ~/.ssh/authorized_keys file so I can
snarf the last line in the ~ldm/logs/bigbird.uptime log file. This
information is used (by me) to keep tabs on performance of an increasing
number of machines around the IDD. Here is some sample output of
the script's execution:
node     CCYYMMDD.HHMM   1min  5min 15min feed ing tot    age mfree swpused
--------+-------------+------+-----+-----+----+---+---+------+-----+-------
uni1     20050419.2304   0.36  0.30  0.42    1   2   3   9080    8M     52M
uni2     20050419.2306   5.39  6.76  6.96   69   2  71   8852    9M     83M
uni3     20050419.2305   0.74  1.79  2.71   77   2  79   8203   15M     77M
uni4     20050419.2306   0.29  0.91  0.88   44   2  46   8862   15M     76M
thelma   20050419.2304   0.07  0.09  0.08    1   2   3   4184   20M     80M
jackie   20050419.2306   0.23  0.29  0.22    2   4   6   5050   11M     46M
desi     20050419.2306   0.12  0.17  0.20    2   4   6   2476  346M   1664M
samoon   20050419.2303   0.24  0.18  0.12    1   4   5   1188    1M    327M
igor     20050419.2306   0.02  0.07  0.07    1   4   5   5049   12M     38M
emo      20050419.2306   0.14  0.26  0.37    6  12  18   2403 1557M   1024M
oliver   20050419.2306   0.08  0.15  0.16    5  10  15   4278   14M     97M
mother   20050419.2306   4.92  4.96  5.18    4  15  19   2427  543M   1923M
atm      20050419.2306   2.43  2.67  2.73   69  10  79   4103  918M    445M
unidata2 20050419.2306   0.47  0.65  1.12   38   5  43   2049 1294M    245M
papagayo 20050419.2306   1.68  1.73  1.88   37  14  51   3120  740M    694M
bigbird  20050419.2306   7.28 10.70 12.02   32  14  46   1636    5M     24K
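
For anyone who wants to post-process a report like the one above, here is a
minimal Python sketch that splits each row into its eleven columns and lists
the nodes whose 5-minute load average exceeds a threshold. It is not the
script that produced the report: the names UptimeRow, parse_row, and
busy_nodes, and the 5.0 threshold, are assumptions made for illustration, and
the field names are simply copied from the header row.

```python
from typing import Iterable, List, NamedTuple


class UptimeRow(NamedTuple):
    """One report row; field names follow the header of the sample output."""
    node: str
    stamp: str      # CCYYMMDD.HHMM
    load1: float
    load5: float
    load15: float
    feed: int
    ing: int
    tot: int
    age: int
    mfree: str      # kept as text because it carries a unit suffix (M, K)
    swpused: str    # same


def parse_row(line: str) -> UptimeRow:
    """Split one whitespace-separated report row into typed fields."""
    f = line.split()
    if len(f) != 11:
        raise ValueError("expected 11 columns, got %d: %r" % (len(f), line))
    return UptimeRow(f[0], f[1], float(f[2]), float(f[3]), float(f[4]),
                     int(f[5]), int(f[6]), int(f[7]), int(f[8]), f[9], f[10])


def busy_nodes(lines: Iterable[str], threshold: float = 5.0) -> List[str]:
    """Return nodes whose 5-minute load average is above the threshold.

    The threshold is a hypothetical cutoff, not one taken from the report.
    """
    return [row.node for row in map(parse_row, lines) if row.load5 > threshold]


# Three rows copied from the sample output above.
SAMPLE = [
    "uni1 20050419.2304 0.36 0.30 0.42 1 2 3 9080 8M 52M",
    "uni2 20050419.2306 5.39 6.76 6.96 69 2 71 8852 9M 83M",
    "bigbird 20050419.2306 7.28 10.70 12.02 32 14 46 1636 5M 24K",
]

if __name__ == "__main__":
    print(busy_nodes(SAMPLE))
```

With the three sample rows, busy_nodes(SAMPLE) picks out uni2 and bigbird.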
mitigation:
- seeing that bigbird was struggling with the ingestion, decoding, and
relaying functions that it was performing, and knowing that you are
not actively using the files being decoded into McIDAS format using
McIDAS-XCD decoders, I decided to stop running the decoders to see
if that would help
- I then cleaned up orphaned shared memory segments left over from
incorrectly exited McIDAS processes -- this freed up some, but not
a lot of, memory
- I killed an invocation of scourByDay.tcl since it had been racking
up lots of CPU time
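
As an illustration of the shared memory cleanup step, the Python sketch below
scans text in the layout that Linux "ipcs -m" prints and returns the IDs of
segments that belong to a given owner and have no attached processes. This is
only one plausible way to spot leftovers and is not necessarily how the
segments on bigbird were identified: a zero attach count does not by itself
prove a segment is orphaned, the owner name "mcidas" and the sample listing
are hypothetical, and the column layout is assumed to be key, shmid, owner,
perms, bytes, nattch, status.

```python
from typing import List


def orphaned_segments(ipcs_output: str, owner: str) -> List[int]:
    """Return shmids owned by `owner` that have zero attached processes.

    Assumes the Linux `ipcs -m` column order:
    key shmid owner perms bytes nattch [status]
    """
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows start with a hex key; banner and header lines do not.
        if len(fields) < 6 or not fields[0].startswith("0x"):
            continue
        shmid, seg_owner, nattch = fields[1], fields[2], fields[5]
        if seg_owner == owner and nattch == "0":
            ids.append(int(shmid))
    return ids


# Hypothetical listing in the assumed layout; not captured from bigbird.
SAMPLE_IPCS = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 32768      mcidas     600        1048576    0
0x00000000 65537      mcidas     600        1048576    2          dest
0x0000162e 98306      ldm        666        4096       0
"""

if __name__ == "__main__":
    print(orphaned_segments(SAMPLE_IPCS, "mcidas"))
```

Each ID returned this way would still need a manual check before the segment
is removed, for example with "ipcrm -m <shmid>".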
After making these changes, the load average on bigbird dropped from
the 19-25 range back down to the 5-17 range. How much of the drop has
been due to catching up on CONDUIT, CRAFT, and NNEXRAD processing is
unknown, but I have the sneaking suspicion that killing off the wayward
scourByDay.tcl script helped a bunch. Turning off the XCD processes
also didn't hurt, since they were using I/O, and I/O was the limiting
factor in the slowdown.
Mike Schmidt and I were discussing the sluggishness of bigbird, and we
agreed that you might well benefit from an upgrade to Fedora Core 3.
Just wanted to let you know...
Cheers,
Tom