20020726: Help! LDM 5.2 keeps crashing on Linux RH 7.3!
- Subject: 20020726: Help! LDM 5.2 keeps crashing on Linux RH 7.3!
- Date: Fri, 26 Jul 2002 09:45:26 -0600
>From: Gilbert Sebenste <address@hidden>
>Organization: NIU
>Keywords: 200207261442.g6QEgK915724 LDM 5.2 RedHat 7.3 Linux
Gilbert,
>I need help with my ldm 5.2 on weather2.admin.niu.edu. I'm running it on a
>Dell Dimension P3, running RedHat Linux 7.3, all patches on (see
>http://www.redhat.com/errata/rh73-errata.html). My LDM mysteriously dies,
>without warning, with these messages.
If you haven't done so yet, I would try building 5.2 from source on
weather2. In fact, when I run into LDM problems with a binary
installation, this is always the first thing I do.
>Note: Weather.admin and
>weather3.admin LDM's stay running fine (and both are on
>5.2!), though weather2 is the machine where almost everything is ingested:
>
>Jul 26 10:37:16 weather2 pnga2area[2898]: unPNG:: 1518538 2422256 1.5951
>Jul 26 10:37:16 weather2 pnga2area[2898]: Exiting
>Jul 26 10:39:53 weather2 rpc.ldmd[11879]: child 11883 terminated by signal 11
>Jul 26 10:39:53 weather2 rpc.ldmd[11879]: Killing (SIGINT) process group
>Jul 26 10:39:53 weather2 rpc.ldmd[11879]: Interrupt
>Jul 26 10:39:53 weather2 rpc.ldmd[11879]: Exiting
>Jul 26 10:39:53 weather2 pqbinstats[11880]: Interrupt
>Jul 26 10:39:53 weather2 pqact[11882]: Interrupt
>Jul 26 10:39:53 weather2 pqact[11882]: Exiting
>Jul 26 10:39:53 weather2 weather-01[11884]: Interrupt
>Jul 26 10:39:53 weather2 pqact[11885]: Interrupt
>Jul 26 10:39:53 weather2 weather-02[11886]: Interrupt
>Jul 26 10:39:53 weather2 weather-02[11886]: Exiting
>Jul 26 10:39:53 weather2 weather-03[11887]: Interrupt
>Jul 26 10:39:53 weather2 131.156.8.47[11888]: Interrupt
>Jul 26 10:39:53 weather2 flood-3[11889]: Interrupt
>Jul 26 10:39:53 weather2 flood-2[11890]: Interrupt
>Jul 26 10:39:53 weather2 weather[11891]: Interrupt
>Jul 26 10:39:53 weather2 weather-01[11884]: Exiting
>Jul 26 10:39:53 weather2 131.156.8.47[11888]: Exiting
>Jul 26 10:39:53 weather2 flood-3[11889]: Exiting
>Jul 26 10:39:53 weather2 weather-03[11887]: Exiting
>Jul 26 10:39:53 weather2 pqact[11885]: Exiting
>Jul 26 10:39:53 weather2 whistler(feed)[12043]: Interrupt
>Jul 26 10:39:54 weather2 rpc.ldmd[11879]: Terminating process group
>Jul 26 10:39:54 weather2 pqbinstats[11880]: Exiting
>
>What should I do besides utterly panic? :-) It won't stay up for more than
>7 hours. Increasing queue size to 500 MB didn't help. Neither did a Glibc
>patch that came out this week. Oh, and when I just type in "ldmadmin
>start"...it restarts fine, no queue corruption (ldmadmin queuecheck came
>back empty).
Other than building from source, I would try putting the LDM into
verbose logging at the 6-hour mark. Perhaps the verbose logging will
shed some more light on why the group leader rpc.ldmd exits. You change
the LDM logging verbosity by sending the group leader rpc.ldmd
a USR1 signal. The first 'kill -USR1' raises the logging level to
verbose; the second raises it to debug; the third returns it to silent.
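
If you would rather script this than type 'kill' by hand, here is a
minimal sketch in Python of the USR1 cycle described above. The names
LEVELS, level_after and bump_verbosity are made up for illustration and
are not part of the LDM; the sketch also assumes logging starts out at
silent and that you look up the PID of the group leader rpc.ldmd
yourself (11879 in your log excerpt).

```python
import os
import signal

# Logging levels in the order the USR1 cycle steps through them.
# Assumption: the LDM starts at "silent" (the default, non-verbose level).
LEVELS = ["silent", "verbose", "debug"]


def level_after(n_signals, start="silent"):
    """Return the logging level reached after n_signals USR1 signals."""
    i = LEVELS.index(start)
    return LEVELS[(i + n_signals) % len(LEVELS)]


def bump_verbosity(pid):
    """Send one USR1 to the process, the same as 'kill -USR1 <pid>'."""
    os.kill(pid, signal.SIGUSR1)
```

Calling bump_verbosity(pid) once on the group leader should take you
from silent to verbose; level_after(n) is only a reminder of where n
signals leave you, so you don't overshoot back to silent.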
>Thanks for any help!
Tom