20021018: Proftomd hanging on RedHat 8.0 (cont.)
- Subject: 20021018: Proftomd hanging on RedHat 8.0 (cont.)
- Date: Fri, 18 Oct 2002 12:32:57 -0600
>From: Gilbert Sebenste <address@hidden>
>Organization: NIU
>Keywords: 200210050242.g952g0127088 ldm-mcidas proftomd
Gilbert,
re: setting a data monitor inactive
>OK. I didn't know if you wanted that done, since I thought you were trying
>to see what was up with it.
The test I am running is the following:
o I uncommented the startup of XCD routines in ~ldm/etc/ldmd.conf
o logged in as 'mcidas', I disabled the running of the synoptic/ship/buoy
decoder dmsyn.k
o I created the directory ~mcidas/workdata/test
o copied DECINFO.DAT from ~mcidas/workdata to ~mcidas/workdata/test
o changed MCPATH for 'mcidas' from the command line to add ~mcidas/workdata/test
to the front:
MCPATH=/home/mcidas/workdata/test:/home/mcidas/workdata:/home/mcidas/data:/home/mcidas/help
o cd to ~mcidas/workdata/test
o start a McIDAS environment:
mcenv
o in this environment, I turn on the synoptic/ship/buoy decoder:
decinfo.k SET DMSYN ACTIVE
This does not affect the copy of DECINFO.DAT that is used by the XCD
supervisory routine startxcd.k (which is started at LDM startup
from the 'exec xcd_run MONITOR' invocation in ~ldm/etc/ldmd.conf).
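For reference, the directory and MCPATH steps above could be scripted
roughly as follows. This is only a sketch for a Bourne-style shell: the
function names make_test_workdir and prepend_path are made up for
illustration and are not part of McIDAS, and the paths in the usage
comment are the ones from this message.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not McIDAS commands.

# make_test_workdir WORKDATA
#   Create WORKDATA/test, copy DECINFO.DAT into it, and print the new
#   directory name. The original DECINFO.DAT is left untouched.
make_test_workdir() {
    workdata=$1
    mkdir -p "$workdata/test" || return 1
    cp "$workdata/DECINFO.DAT" "$workdata/test/" || return 1
    printf '%s\n' "$workdata/test"
}

# prepend_path DIR PATHLIST
#   Print PATHLIST with DIR added at the front (colon separated).
prepend_path() {
    if [ -n "$2" ]; then
        printf '%s\n' "$1:$2"
    else
        printf '%s\n' "$1"
    fi
}

# Usage on the real system, logged in as 'mcidas':
#   testdir=$(make_test_workdir /home/mcidas/workdata)
#   MCPATH=$(prepend_path "$testdir" "$MCPATH"); export MCPATH
#   cd "$testdir" && mcenv
```

Because the test directory comes first in MCPATH, a decinfo.k run from
that environment should find and modify the copied DECINFO.DAT rather
than the one in ~mcidas/workdata.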
At this point, I can run the synoptic/ship/buoy decoder by hand. In order
to set up an environment in which I can cause a core file to be dumped
(McIDAS turns off creation of core files by default), I have to do
a couple of things within the McIDAS environment I created with mcenv:
ucu.k POKE 142 0 <- tell McIDAS to enable core dumps
unlimit coredumpsize <- tell Linux to enable core dumps
Now, I can run the decoder by hand AND cause a core file to be dumped
if/when it goes into its infinite loop:
dmsyn.k RESTART=-1 DEV=CCC
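Note that 'unlimit coredumpsize' is csh/tcsh syntax. If the shell you
end up in is Bourne-style (sh, bash) instead, the same effect can be
had with ulimit. A minimal sketch, where the function name
enable_core_dumps is made up for illustration:

```shell
#!/bin/sh
# Sketch only: Bourne-shell counterpart of csh 'unlimit coredumpsize'.

# enable_core_dumps
#   Raise the soft core-file size limit to the hard limit. Must be
#   called in the current shell (not in a subshell) so that programs
#   started afterwards inherit the new limit.
enable_core_dumps() {
    ulimit -S -c "$(ulimit -H -c)"
}

# Usage, inside the environment created by mcenv:
#   ucu.k POKE 142 0
#   enable_core_dumps
#   dmsyn.k RESTART=-1 DEV=CCC
```

If the hard limit is itself 0, this cannot enable core files; the hard
limit would have to be raised first by whatever sets it on that system.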
Phew!
re: how to see which XCD data monitors are active
>Yep. OK...any ideas?
Not yet. I am hopeful that the copy of dmsyn.k that I created with
the '-g' flag set for compilation (of m0syndec.for, m0shpdec.for, and
dmsyn.pgm) will provide a core dump that will tell me where the decoder
gets into an infinite loop. Once I have that information, I can examine
the code and see what needs to be bulletproofed.
>The new kernel is in. Oh, interestingly, it is NOT
>doing it on weather.admin.
Very weird given that weather and weather2 are both running RH 8.0!
I see that you commented out the execution of proftomd on weather. Does
this mean that it was hanging there also?
>I betcha RedHat comes out with a new Glibc
>soon...customers are pretty ticked off. Let's see if that fixes it.
The problem with proftomd really does seem to be related to one of the
glibc shared libraries. The reason I can say this is that you were
using a binary version of proftomd built on RH 7.1. That version of
proftomd is running on several RH 7.[0123] systems with no problems.
Also, where the program goes into an infinite loop is outside of any
particular call. The only thing I did (the hack/kludge) was to have it
not try to update the McIDAS routing table with the information that a
new set of data had been received and decoded.
I ran strace on proftomd but nothing was revealed. I examined proftomd
routines to make sure that no arrays were being overflowed, or pointers
blown -- nothing. The kludge was only made to get things working.
Tom