20030920: mesodata load average
- Subject: 20030920: mesodata load average
- Date: Sat, 20 Sep 2003 10:14:46 -0600
>From: Gerry Creager N5JXS <address@hidden>
>Organization: Texas A&M University -- AATLT
>Keywords: 200309201339.h8KDdPk1012811
Hi Gerry,
>Yesterday afternoon, when I logged in, it looked like things had calmed
>down.
I did a couple of things on mesodata yesterday to try and calm things:
- set up the runtime links to all point at ldm-6.0.14; this was more
housekeeping than something that would calm things
- commented out the ~ldm/scour.conf entry that was scouring the
~ldm/data/gempak/nexrad directories. I FTPed down a script
designed to scour NEXRAD data, prune_nexrad.csh, and set it
up to keep about a day's worth of images. (It is only "about" a day
because prune_nexrad.csh keeps a fixed number of files rather than a
time window; I set it to keep 288 files, which is 1 day of images,
one every 5 minutes, when the radar is operating in storm mode.)
- changed the LDM scour to run less often than every 2 hours; I noticed
that the biggest consumer of machine resources was multiple
simultaneous invocations of the scouring script, and, since scouring
hits the disk _hard_, it is better to run it as little as possible
- cleaned out the ~ldm/logs directory of .stats files (produced by
running pqbinstats; there were 2462 of these files there); these
files were not getting sent to Unidata since there was no crontab
entry to do the send and remove. I added the entry that runs
'bin/ldmadmin dostats' at 35 past the hour.
- cleaned out the ~ldm/data/nexrad/NIDS directory since you are now
FILEing NEXRAD images in the ~ldm/data/gempak/nexrad/NIDS directory.
This freed up over a GB of disk (there were 122000+ files there)
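
As a rough illustration of the keep-the-newest-N approach described
above, here is a minimal POSIX shell sketch. It is not prune_nexrad.csh
itself (that script is not shown in this message); the function name
keep_newest is hypothetical, and the sketch assumes the directory holds
only plain files whose names contain no newlines.

```shell
# keep_newest DIR N
# Hypothetical helper: delete all but the N most recently modified
# files in DIR. Assumes DIR contains only plain files.
keep_newest() {
    dir=$1
    keep=$2
    (
        cd "$dir" || exit 1
        # ls -t lists newest first; tail -n +K prints from line K on,
        # so everything after the first $keep names gets removed.
        ls -t | tail -n +"$((keep + 1))" | while IFS= read -r f; do
            if [ -f "$f" ]; then
                rm -f -- "$f"
            fi
        done
    )
}
```

Calling it as `keep_newest /path/to/one/product/directory 288` would
keep the 288 newest files there (the path is a placeholder). Pruning by
count avoids walking the whole data tree the way a find-based scour
does.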
I noticed that the ~ldm/data/ddplus directory is huge (about 33 GB),
and a number of other directories under ~ldm/data are large as well:
$ cd ~ldm/data
$ du -sk *
19660 AR
2141676 ARCHIVES
79456 binex
408 combhourly_pwv
0 cronlog
34389440 ddplus
4 decoded
124176 difax
4 fcst
4 forecasts
...
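
To pick out the largest directories from a listing like the one above
without reading every line, the du output can be sorted numerically.
This is only a sketch; the function name largest_dirs is hypothetical.

```shell
# largest_dirs DIR [N]
# Hypothetical helper: print the N largest entries directly under DIR
# (sizes in KB, biggest first). N defaults to 10.
largest_dirs() {
    dir=$1
    n=${2:-10}
    (
        cd "$dir" || exit 1
        # du -sk prints "<size-in-KB> <name>"; sort -rn puts the
        # largest sizes first.
        du -sk -- * | sort -rn | head -n "$n"
    )
}
```

For example, `largest_dirs ~ldm/data 5` would list the five biggest
entries under ~ldm/data.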
If scouring is attempted in any of these directories, your system will
slow to a crawl. I ran out of time yesterday afternoon, so I haven't
determined whether scouring of these directories is what is causing
your problems.
>Last night, radar was flowing nicely. And, of course, this
>morning, the load avg was back around 15.
Yesterday the load average was hovering at around 13-15. I found 5
invocations of LDM's scour running (5 instances of 'find'). It was
after I stopped the LDM and killed those scour invocations that the
load average went back down to less than 1.
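
One quick way to spot piled-up scour runs like these is to count the
running processes with a given command name. The sketch below is
illustrative only: the helper name count_named is hypothetical, and it
assumes 'ps -e -o comm=' prints one bare command name per line, as it
does on most POSIX systems.

```shell
# count_named NAME
# Hypothetical helper: read command names on stdin, one per line, and
# print how many of them are exactly NAME.
count_named() {
    awk -v name="$1" '$1 == name { c++ } END { print c + 0 }'
}

# Typical use (commented out so that sourcing this file runs nothing):
#   ps -e -o comm= | count_named find
#   uptime          # compare against the current load average
```

More than one 'find' at a time suggests that a new scour started before
the previous one finished.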
>I've cut scour back to 0100 local, once per day. I'm still trying to
>find something causing the load to shoot up.
This is a good step, and, now that the NEXRAD directories are being
scoured by a different script, it should be all that you need.
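
For reference, the cron schedule discussed in this message could be
written roughly as follows in the ldm user's crontab. This is a sketch:
the dostats entry is the one described earlier, while the exact form of
the scour entry on mesodata is an assumption (only its 0100 local,
once-per-day timing is stated).

```
# Send the pqbinstats .stats files to Unidata and remove them,
# at 35 minutes past each hour
35 * * * * bin/ldmadmin dostats

# Run the LDM scour once per day at 0100 local time
# (assumed invocation; only the timing is given in this thread)
0 1 * * * bin/ldmadmin scour
```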
>I may have to revamp some of the parsing and db stuff to fix this.
I don't think that this is the problem, but I haven't had enough time
to really look at things in enough detail to know.
>Any thoughts?
The other thing I saw was that your GEMPAK decoding was not set up
exactly as Chiz recommends. I would like to revamp this setup so
that future GEMPAK upgrades can be done without a lot of thinking.
A standard installation would also allow the GEMPAK utility to
rotate GEMPAK log files. At least one of the log files that I renamed
yesterday was 2 GB in size, and, since that is the maximum file size,
it was no longer being written to.
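
Until log rotation is in place, a check along these lines could flag
log files that are approaching the size limit. This is only a sketch:
the helper name big_files is hypothetical, and the directory in the
usage note is a placeholder since the location of the GEMPAK logs is
not given here.

```shell
# big_files DIR BYTES
# Hypothetical helper: list regular files under DIR that are larger
# than BYTES bytes (the "c" suffix tells find to count in bytes).
big_files() {
    find "$1" -type f -size +"$2"c -print
}
```

For example, `big_files /path/to/gempak/logs 1900000000` would list
any file over roughly 1.9 GB under that placeholder directory.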
>Thanks, gerry
Got to run...
Tom
--
+-----------------------------------------------------------------------------+
* Tom Yoksas UCAR Unidata Program *
* (303) 497-8642 (last resort) P.O. Box 3000 *
* address@hidden Boulder, CO 80307 *
* Unidata WWW Service http://www.unidata.ucar.edu/*
+-----------------------------------------------------------------------------+