This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
>From: Gilbert Sebenste <address@hidden>
>Organization: NIU
>Keywords: 200207261442.g6QEgK915724 LDM 5.2 RedHat 7.3 Linux

Gilbert,

>I need help with my ldm 5.2 on weather2.admin.niu.edu. I'm running it on a
>Dell Dimension P3, running RedHat Linux 7.3, all patches on (see
>http://www.redhat.com/errata/rh73-errata.html). My LDM mysteriously dies,
>without warning, with these messages.

If you haven't done so yet, I would try building 5.2 from source on weather2. In fact, when I run into LDM problems with a binary installation, this is always the first thing I do.

>Note: Weather.admin and
>weather3.admin LDM's stay running fine (and both are on
>5.2!), though weather2 is the machine where almost everything is ingested:
>
>Jul 26 10:37:16 weather2 pnga2area[2898]: unPNG:: 1518538 2422256 1.5951
>Jul 26 10:37:16 weather2 pnga2area[2898]: Exiting
>Jul 26 10:39:53 weather2 rpc.ldmd[11879]: child 11883 terminated by signal 11
>Jul 26 10:39:53 weather2 rpc.ldmd[11879]: Killing (SIGINT) process group
>Jul 26 10:39:53 weather2 rpc.ldmd[11879]: Interrupt
>Jul 26 10:39:53 weather2 rpc.ldmd[11879]: Exiting
>Jul 26 10:39:53 weather2 pqbinstats[11880]: Interrupt
>Jul 26 10:39:53 weather2 pqact[11882]: Interrupt
>Jul 26 10:39:53 weather2 pqact[11882]: Exiting
>Jul 26 10:39:53 weather2 weather-01[11884]: Interrupt
>Jul 26 10:39:53 weather2 pqact[11885]: Interrupt
>Jul 26 10:39:53 weather2 weather-02[11886]: Interrupt
>Jul 26 10:39:53 weather2 weather-02[11886]: Exiting
>Jul 26 10:39:53 weather2 weather-03[11887]: Interrupt
>Jul 26 10:39:53 weather2 131.156.8.47[11888]: Interrupt
>Jul 26 10:39:53 weather2 flood-3[11889]: Interrupt
>Jul 26 10:39:53 weather2 flood-2[11890]: Interrupt
>Jul 26 10:39:53 weather2 weather[11891]: Interrupt
>Jul 26 10:39:53 weather2 weather-01[11884]: Exiting
>Jul 26 10:39:53 weather2 131.156.8.47[11888]: Exiting
>Jul 26 10:39:53 weather2 flood-3[11889]: Exiting
>Jul 26 10:39:53 weather2 weather-03[11887]: Exiting
>Jul 26 10:39:53 weather2 pqact[11885]: Exiting
>Jul 26 10:39:53 weather2 whistler(feed)[12043]: Interrupt
>Jul 26 10:39:54 weather2 rpc.ldmd[11879]: Terminating process group
>Jul 26 10:39:54 weather2 pqbinstats[11880]: Exiting
>
>What should I do besides utterly panic? :-) It won't stay up for more than
>7 hours. Increasing queue size to 500 MB didn't help. Neither did a Glibc
>patch that came out this week. Oh, and when I just type in "ldmadmin
>start"...it restarts fine, no queue corruption (ldmadmin queuecheck came
>back empty).

Other than building from source, I would try putting the LDM into verbose logging at about the 6 hour mark, before it is likely to die. Perhaps the verbose logging will shed some more light on why the group leader rpc.ldmd exits. Note that your log shows a child of rpc.ldmd (process 11883) being terminated by signal 11, after which the group leader shuts down the whole process group.

You change the LDM logging verbosity by sending the group leader rpc.ldmd a USR1 signal. The first 'kill -USR1' raises the logging level to verbose; the second raises it to debug; the third returns it to silent.

>Thanks for any help!

Tom
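
As an illustration of the USR1 cycle described in the reply, here is a minimal Python sketch. It is not part of the LDM: the names `LEVELS`, `level_after`, and `bump_logging` are hypothetical, the starting level is assumed to be silent, and the process ID of the group leader rpc.ldmd has to be supplied by whoever runs it.

```python
import os
import signal

# Logging levels in the order given in the reply: each USR1 moves one step
# forward, and the third signal wraps back to silent.
LEVELS = ("silent", "verbose", "debug")


def level_after(signals_sent, start="silent"):
    """Return the logging level reached after `signals_sent` USR1 signals.

    Assumes the process began at `start` (silent unless stated otherwise).
    """
    return LEVELS[(LEVELS.index(start) + signals_sent) % len(LEVELS)]


def bump_logging(pid):
    """Send a single USR1 to process `pid`, the equivalent of 'kill -USR1 pid'.

    `pid` is expected to be the group leader rpc.ldmd; looking it up is left
    to the caller. Requires a Unix-like system and permission to signal it.
    """
    os.kill(pid, signal.SIGUSR1)
```

For example, `level_after(1)` gives "verbose" and `level_after(3)` gives "silent" again, so counting the signals already sent is the only way this sketch knows the current level.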