20021213: McIDAS on weather.admin.niu.edu
- Subject: 20021213: McIDAS on weather.admin.niu.edu
- Date: Wed, 18 Dec 2002 14:36:27 -0700
>From: Gilbert Sebenste <address@hidden>
>Organization: NIU
>Keywords: 200212131543.gBDFhG410430 McIDAS
Gilbert,
>Is it possible that you could put the McIDAS memory leak patches onto
>weather.admin? My machine has gotten very slow in the last several
>months...and I'm wondering if that is the problem.
I logged onto weather today and don't see that the McIDAS-XCD data
monitors are using up excessive amounts of either memory or CPU. In
fact, the processes that seem to be the big hogs are:
rad
X
nautilus
rhn-applet-gui
gnome-panel
>And yes, go ahead and
>put the latest version of McIDAS on there to match weather2, if need be,
>and if you have time.
If I can see a clear indication that McIDAS has something to do with
the problems you are seeing, I will do the upgrade.
>From address@hidden Wed Dec 18 09:37:44 2002
>I throw my hands up. I do not know what is causing this. On
>weather.admin.niu.edu, I'm running RH 8.0, Gcc 3.2-11, latest version of
>glibc. The machine has 1 GB of memory, but after an hour or less after
>rebooting, it is full and then starts using disk swap all the time,
>causing the machine to bog down severely.
We (our system administrator, Mike Schmidt, and I) observed this after
logging on today. A quick look with top at the big memory users shows:
2:27pm up 2:31, 3 users, load average: 8.21, 6.22, 5.10
128 processes: 123 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 64.1% user, 19.4% system, 0.0% nice, 15.3% idle
CPU1 states: 58.0% user, 13.4% system, 0.0% nice, 27.4% idle
Mem: 1030548K av, 1014280K used, 16268K free, 0K shrd, 20896K buff
Swap: 256996K av, 580K used, 256416K free 799736K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
16247 ldm 15 0 17584 17M 628 S 5.3 1.7 0:41 pqact
979 ldm 15 0 15528 15M 9908 S 0.0 1.5 0:04 nautilus
985 ldm 15 0 13748 13M 9364 S 0.0 1.3 0:13 rhn-applet-gui
977 ldm 16 0 13452 13M 8164 S 3.5 1.3 1:17 gnome-panel
995 ldm 15 0 8608 8604 6744 S 0.1 0.8 0:18 gnome-terminal
947 ldm 15 0 8416 8412 6512 S 0.0 0.8 0:00 gnome-session
960 ldm 15 0 7188 7184 5788 S 0.0 0.6 0:01 gnome-settings-
958 ldm 15 0 6232 6232 5128 S 0.5 0.6 0:18 metacity
954 ldm 15 0 4980 4980 1984 S 0.0 0.4 0:00 gconfd-2
983 ldm 15 0 3996 3996 3440 S 0.0 0.3 0:00 pam-panel-icon
956 ldm 15 0 2244 2244 1840 S 0.0 0.2 0:00 bonobo-activati
12696 ldm 15 0 1912 1912 1036 S 0.0 0.1 0:00 tcsh
974 ldm 15 0 1712 1712 1408 S 0.0 0.1 0:02 xscreensaver
996 ldm 15 0 1632 1632 812 S 0.0 0.1 0:00 tcsh
16263 ldm 15 0 1616 1616 712 D 1.1 0.1 0:51 dmsfc.k
20188 ldm 25 0 1540 1540 1268 R 49.0 0.1 0:02 rad
16244 ldm 15 0 1392 1392 1300 S 7.2 0.1 0:53 rpc.ldmd
964 ldm 15 0 1364 1364 1064 S 0.0 0.1 0:00 fam
16266 ldm 15 0 1044 1044 744 S 0.0 0.1 0:00 dmmisc.k
18744 ldm 15 0 1012 1012 752 R 1.7 0.0 0:10 top
16264 ldm 15 0 988 988 700 S 0.0 0.0 0:00 dmraob.k
948 ldm 15 0 988 968 792 S 0.0 0.0 0:00 ssh-agent
20043 ldm 18 0 964 964 816 S 0.7 0.0 0:00 dtnradscript
20145 ldm 17 0 952 952 816 S 0.0 0.0 0:00 doppler.srmv1
888 ldm 18 0 936 936 644 S 0.0 0.0 0:00 tcsh
16246 ldm 19 0 928 924 860 S 0.0 0.0 0:00 startxcd.k
16280 ldm 16 0 920 916 844 S 0.0 0.0 0:00 ingetext.k
918 ldm 15 0 868 868 748 S 0.0 0.0 0:00 imwheel
16265 ldm 15 0 828 828 668 S 0.0 0.0 0:00 dmsyn.k
16249 ldm 15 0 732 732 608 S 0.3 0.0 0:04 pqact
16245 ldm 15 0 728 728 644 S 1.9 0.0 0:12 pqbinstats
16248 ldm 15 0 668 668 588 S 0.5 0.0 0:10 pqsurf
As you can see from this list, the McIDAS-XCD processes (dmsfc.k, dmsyn.k,
dmraob.k, dmmisc.k, ingebin.k, ingetext.k, and startxcd.k) are not
using much memory or CPU. So, the answer to the mystery lies elsewhere.
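For reference, the combined footprint of those monitors can be totaled
from a saved top listing. A minimal sketch; the helper name is mine, and
the field positions assume the 12-column top format shown above:

```shell
#!/bin/sh
# Hypothetical helper: sum the %MEM column for commands matching a
# pattern in a saved `top` listing with the column layout shown above
# (PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND).
sum_mem() {
    # $1: extended regex matched against COMMAND, $2: file of top lines
    awk -v pat="$1" '$12 ~ pat { total += $10 } END { printf "%.1f\n", total }' "$2"
}

# Example: total %MEM of the XCD data monitors (dmsfc.k, dmraob.k, ...):
# sum_mem '^dm' saved_top.txt
```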
Just by chance, I did a netstat to see what machines were connected
to weather, and Mike noticed that the entries for the LDM connections
were showing the LDM port number, not the mnemonic (i.e., 388 instead
of ldm). This told us that you did not have the requisite LDM entries
in your /etc/services file. A quick look verified this.
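That check can be scripted; a sketch, with a helper name of my own
choosing (not a command from the session above):

```shell
#!/bin/sh
# Hypothetical helper: report whether the LDM port entries exist in a
# services(5) file. When the mapping is missing, netstat shows the raw
# port number (388) instead of the mnemonic (ldm).
check_ldm_entries() {
    # $1: path to a services file, e.g. /etc/services
    if grep -qE '^ldm[[:space:]]+388/(tcp|udp)' "$1"; then
        echo "ldm entries present"
    else
        echo "ldm entries missing"
    fi
}

# On a properly configured LDM host this prints "ldm entries present":
[ -f /etc/services ] && check_ldm_entries /etc/services || true
```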
OK, so hold on to your butt...
When I went to add the /etc/services entries for the LDM, the load average
on weather was right around 5. I added the entries (as per web page
instructions for setting up an LDM):
<all done as 'root'>
# Local services
ldm 388/tcp
ldm 388/udp
and then sent a HUP signal to xinetd:
% ps -eaf | grep xinetd
root 554 1 0 11:56 ? 00:00:00 xinetd -stayalive -reuse -pidfil
ldm 28663 12696 0 15:29 pts/2 00:00:00 grep -i xinetd
% kill -HUP 554
I then exited from 'root' back to 'ldm' and ran top again. To my
amazement, the load average had dropped to less than 2! Why it
dropped is a bit of a mystery: either something in your system
needed the /etc/services entries for the LDM, or the HUP to xinetd
freed up some system resource.
After watching things for a while, I see that the CPU use as reported by
top continues to be relatively low (except when rad or X kicks in).
>I think part of it is McIDAS
>memory leaks which Tom Yoksas has been working on,
I disagree. The McIDAS-XCD processes stayed reasonably small, and their
CPU use is low.
>but I can't account for
>ALL of it. Can anyone venture a guess? I am running WXP and McIDAS, as
>well as the latest version of apache, all patched, on weather. The result:
>load average sticks between 5 and 10, data is missed all over the place,
>and I have no clue why.
So, either something in RedHat 8 causes the CPU load to skyrocket
when xinetd gets bogged down for some reason, or the LDM entries
in /etc/services are vital.
By the way, we see a lot of connect/disconnects from weather.cod.edu
in your ~ldm/ldmd.log file. This should be looked into.
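Gauging that churn is a one-liner; a sketch, assuming the log path used
above (the exact connect/disconnect message wording varies by LDM
version):

```shell
#!/bin/sh
# Hypothetical helper: count log lines mentioning a host, as a rough
# measure of how often it is connecting and disconnecting.
count_host_lines() {
    # $1: host name, $2: path to an ldmd.log
    grep -c "$1" "$2"
}

# Example (path as in the message above):
# count_host_lines weather.cod.edu ~ldm/ldmd.log
```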
Tom