This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Art,

> To: address@hidden
> From: "Arthur A. Person" <address@hidden>
> Subject: rpc.ldmd signal 11's
> Organization: Penn State University
> Keywords: 200408311254.i7VCsh8E018445

The above message contained the following:

> I've seen two cases (on two separate systems) where an rpc.ldmd process
> has died with signal 11 killing LDM data collection. Both are running LDM
> V6.0.15 on RedHat EL 3 update 2 kernel 2.4.21-15.0.4.ELsmp and fully
> patched.

This is new. I'll see if I can reproduce that behavior here.

> The process did not core dump.

The operating system must be told to allow a core-dump by the LDM user.
This is usually done via the command

    ulimit -c unlimited

Before executing the "ldmadmin start" command, verify that core-dumps
are allowed via the command

    ulimit -c

and use the previous command if they're not.

> Here's an excerpt of the ldmd.log files for the most recent:
>
> Aug 26 13:58:49 ls2 rpc.ldmd[3634]: Starting Up (version: 6.0.15; built: Jul 14 2004 15:25:10)
>
> Aug 26 13:58:49 ls2 pqact[3637]: Starting Up
> Aug 26 13:58:49 ls2 pqact[3638]: Starting Up
> Aug 26 13:58:49 ls2 pqact[3639]: Starting Up
> Aug 26 13:58:49 ls2 pqbinstats[3635]: Starting Up (3634)
> Aug 26 13:58:49 ls2 pqact[3641]: Starting Up
> Aug 26 13:58:49 ls2 pqact[3640]: Starting Up
> Aug 26 13:58:49 ls2 ldm[3645]: Starting Up(6.0.15): ldm.meteo.psu.edu: TS_ZERO TS_ENDT {{ANY, ".*"}}
> Aug 26 13:58:49 ls2 ldm[3645]: Desired product class: 20040826135844.784 TS_ENDT {{ANY, ".*"}}
> Aug 26 13:58:49 ls2 pqsurf[3643]: Starting Up (3634)
> Aug 26 13:58:49 ls2 rtstats[3644]: Starting Up (3634)
> Aug 26 13:58:49 ls2 pqact[3646]: Starting Up
> Aug 26 13:58:50 ls2 ldm[3645]: Connected to upstream LDM-6
> Aug 26 13:58:51 ls2 ldm[3645]: Upstream LDM is willing to feed
> Aug 26 14:00:06 ls2 pnga2area[4203]: Starting Up
> Aug 26 14:00:06 ls2 pnga2area[4203]: unPNG:: 115626 309200 2.6741
> Aug 26 14:00:06 ls2 pnga2area[4203]: Exiting
> Aug 26 14:00:50 ls2 pnga2area[4780]: Starting Up
> Aug 26 14:00:50 ls2 pnga2area[4780]: unPNG:: 856353 4506096 5.2620
> Aug 26 14:00:50 ls2 pnga2area[4780]: Exiting
> Aug 26 14:00:52 ls2 pnga2area[4819]: Starting Up
> Aug 26 14:00:52 ls2 pnga2area[4819]: unPNG:: 1067122 4506096 4.2227
> Aug 26 14:00:52 ls2 pnga2area[4819]: Exiting
> .
> .
> .
> Aug 28 05:33:04 ls2 pnga2area[30968]: Starting Up
> Aug 28 05:33:04 ls2 pnga2area[30968]: unPNG:: 90094 242720 2.6941
> Aug 28 05:33:04 ls2 pnga2area[30968]: Exiting
> Aug 28 05:34:03 ls2 pnga2area[31478]: Starting Up
> Aug 28 05:34:03 ls2 pnga2area[31478]: unPNG:: 74544 242720 3.2561
> Aug 28 05:34:03 ls2 pnga2area[31478]: Exiting
> Aug 28 05:35:17 ls2 rpc.ldmd[3634]: child 3645 terminated by signal 11
> Aug 28 05:35:17 ls2 rpc.ldmd[3634]: Killing (SIGINT) process group
> Aug 28 05:35:17 ls2 pqact[3637]: Interrupt
> Aug 28 05:35:17 ls2 pqbinstats[3635]: Interrupt
> Aug 28 05:35:17 ls2 pqact[3637]: Exiting
> Aug 28 05:35:17 ls2 pqact[3638]: Interrupt
> Aug 28 05:35:17 ls2 pqact[3638]: Exiting
> Aug 28 05:35:17 ls2 pqact[3639]: Interrupt
> Aug 28 05:35:17 ls2 rtstats[3644]: Interrupt
> Aug 28 05:35:17 ls2 rpc.ldmd[3634]: SIGINT
> Aug 28 05:35:17 ls2 pqact[3646]: Interrupt
> Aug 28 05:35:17 ls2 pqsurf[3643]: Interrupt
> Aug 28 05:35:17 ls2 pqact[3641]: Interrupt
> Aug 28 05:35:17 ls2 pqact[3639]: Exiting
> Aug 28 05:35:17 ls2 pqact[3646]: Exiting
> Aug 28 05:35:17 ls2 pqact[3640]: Interrupt
> Aug 28 05:35:17 ls2 pqact[3640]: Exiting
> Aug 28 05:35:17 ls2 pqbinstats[3635]: Exiting
> Aug 28 05:35:17 ls2 rtstats[3644]: Exiting
> Aug 28 05:35:17 ls2 pqsurf[3643]: Exiting
> Aug 28 05:35:17 ls2 pqact[3641]: Exiting
> Aug 28 05:35:17 ls2 pqsurf[3643]: Queue usage (bytes):10682240
> Aug 28 05:35:17 ls2 pqsurf[3643]: (nregions): 54034
> Aug 28 05:35:17 ls2 pqsurf[3643]: Number of products 86836
> Aug 28 05:35:17 ls2 pqsurf[3643]: Number of observations 381591
> Aug 28 05:35:17 ls2 pqsurf[3643]: Number of dups 51657
> Aug 28 05:35:17 ls2 rpc.ldmd[3634]: Terminating process group
>
> Any ideas what might be causing this, and/or what I might do to capture
> more/better information to track it down?

Unless the LDM server is built with the "-g" (debugging) option, the
core-dump will be of limited utility. If you don't mind, doing the
following would help greatly:

    1. Go to the top-level source-directory.

    2. Execute the command "make distclean".

    3. Set the environment variables CFLAGS and CPPFLAGS to "-g" and
       "-DNDEBUG", respectively (without the quotes).

    4. Execute the following commands in order:

           make
           ldmadmin stop

    5. Become the superuser.

    6. Execute the following commands in order:

           make server/install_setuids
           ulimit -c unlimited
           ldmadmin start

    7. Cross your fingers. :-)

Your help in this would be greatly appreciated.

> Thanks.
>
> Art.
>
> Arthur A. Person
> Research Assistant, System Administrator
> Penn State Department of Meteorology
> email: address@hidden, phone: 814-863-1563

Regards,
Steve Emmerson

> NOTE: All email exchanges with Unidata User Support are recorded in the
> Unidata inquiry tracking system and then made publically available
> through the web. If you do not want to have your interactions made
> available in this way, you must let us know in each email you send to us.
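
The core-dump check recommended in the reply above (run "ulimit -c", and run "ulimit -c unlimited" if core dumps are not allowed) could be sketched as a small shell script. This is only an illustration under stated assumptions, not part of the LDM: the helper name core_limit_status is hypothetical, and the script assumes a shell such as bash or dash whose ulimit builtin accepts the -c option. It is meant to be run as the LDM user.

```shell
#!/bin/sh
# Sketch only: make sure core dumps are permitted before the LDM is started.
# core_limit_status is a hypothetical helper, not an LDM command.

# Classify a value printed by "ulimit -c": 0 means core dumps are disabled;
# any other value (a block count or "unlimited") means they are allowed.
core_limit_status() {
    if [ "$1" = "0" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

current=$(ulimit -c)
if [ "$(core_limit_status "$current")" = "disabled" ]; then
    # Raising the soft limit fails if the hard limit is also 0.
    ulimit -c unlimited 2>/dev/null ||
        echo "could not raise the core-file size limit" >&2
fi
echo "core-file limit: $(ulimit -c)"

# The limit applies only to this process and its children, so the LDM
# would have to be started from here for the setting to take effect:
# ldmadmin start
```

Because a limit set inside a script does not propagate back to the calling shell, the sketch would either have to be sourced into the shell that runs "ldmadmin start" or have its final line uncommented so that the LDM is started from the same process.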