This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Daryl,

> I'm rolling with LDM 6.13.12 on RHEL 8.2 64bit and just discovered to my
> horror that LDM killed off its pqact child a few weeks back, so I missed all
> kinds of processing :(
>
> The ldmd.log simply has:
>
> 20201019T090611.164977Z ldmd[11302] ldmd.c:reap:131
>   NOTE child 11305 terminated by signal 14: pqact etc/pqact_cod.conf
>
> $ grep 11305 var/logs/ldmd.log.1
> 20201005T151354.111797Z pqact[11305] pqact.c:main:416
>   NOTE Starting Up {cmd: "pqact etc/pqact_cod.conf"}
> 20201005T151354.112168Z pqact[11305] pqact.c:main:586
>   NOTE Starting from insertion-time 2020-10-05 15:13:44.656157 UTC
> 20201019T090611.164977Z ldmd[11302] ldmd.c:reap:131
>   NOTE child 11305 terminated by signal 14: pqact etc/pqact_cod.conf
>
> The only other logging around this time is:
>
> 20201012T131216.455128Z rtstats[11304] error.c:err_log:236
>   WARN Couldn't connect to LDM on rtstats.unidata.ucar.edu using
>   either port 388 or portmapper; : RPC: Remote system error - Connection refused
> 20201012T131217.492879Z rtstats[11304] error.c:err_log:236
>   WARN Couldn't connect to LDM on rtstats.unidata.ucar.edu using
>   either port 388 or portmapper; : RPC: Remote system error - Connection refused
> 20201019T090611.164977Z ldmd[11302] ldmd.c:reap:131
>   NOTE child 11305 terminated by signal 14: pqact etc/pqact_cod.conf
> 20201021T090514.845984Z idd.cod.edu[11307] error.c:err_log:236
>   NOTE No heartbeat from upstream LDM for 300 seconds. Disconnecting.
> 20201021T090514.851470Z idd.cod.edu[11307] requester6.c:req6_new:496
>   NOTE LDM-6 desired product-class: 20201021080514.848966 TS_ENDT
>   {{EXP, "^cod "},{NONE, "SIG=153042095c0c0bd7d673c66ee1b63b87"}}
> 20201021T090514.917249Z idd.cod.edu[11307]
>   requester6.c:make_request:222 NOTE Upstream LDM-6 on idd.cod.edu is
>   willing to be a primary feeder
> 20201021T091015.021508Z idd.cod.edu[11307] error.c:err_log:236
>   NOTE No heartbeat from upstream LDM for 300 seconds. Disconnecting.
> 20201021T091015.021740Z idd.cod.edu[11307] requester6.c:req6_new:496
>   NOTE LDM-6 desired product-class: 20201021081015.021609 TS_ENDT
>   {{EXP, "^cod "},{NONE, "SIG=153042095c0c0bd7d673c66ee1b63b87"}}
> 20201021T091015.063496Z idd.cod.edu[11307]
>   requester6.c:make_request:222 NOTE Upstream LDM-6 on idd.cod.edu is
>   willing to be a primary feeder
> 20201021T091925.810605Z idd.cod.edu[11307] error.c:err_log:236
>   NOTE No heartbeat from upstream LDM for 300 seconds. Disconnecting.
> 20201021T091925.810817Z idd.cod.edu[11307] requester6.c:req6_new:496
>   NOTE LDM-6 desired product-class: 20201021081925.810754 TS_ENDT
>   {{EXP, "^cod "},{NONE, "SIG=ac0db31f508d41ba71763c661b535ef7"}}
> 20201021T091925.860399Z idd.cod.edu[11307]
>   requester6.c:make_request:222 NOTE Upstream LDM-6 on idd.cod.edu is
>   willing to be a primary feeder
>
> Why would LDM send a signal 14 to its pqact!?!?

The LDM didn't. The pqact(1) process sent the SIGALRM (signal 14) to itself
due to a latent bug that's been in pqact(1) since its creation. The bug is
very hard to trigger, which is why no one has seen it until recently.

The latest version of the LDM shouldn't have this problem (I say "shouldn't"
because the bug is *very* hard to trigger).

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: AMR-414518
Department: Support LDM
Priority: Normal
Status: Closed
===================

NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.
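In this exchange the terminated pqact went unnoticed for weeks, because the only trace was a single NOTE line in ldmd.log. As a rough illustration, the Python sketch below scans log lines for that "terminated by signal" message. It is not part of the LDM. The function name `find_killed_children` and the regular expression are assumptions modeled only on the log line quoted in this thread, and the sketch assumes each entry occupies one line in the actual log file (the email shows them wrapped).

```python
import re

# Pattern modeled on the single ldmd.log entry quoted in this thread
# (an assumption; other LDM versions may format the message differently).
CHILD_KILLED = re.compile(
    r"^(?P<time>\S+)\s+ldmd\[(?P<ldmd_pid>\d+)\]\s+\S+\s+"
    r"NOTE\s+child (?P<child_pid>\d+) terminated by signal "
    r"(?P<signal>\d+): (?P<command>.*)$"
)


def find_killed_children(lines):
    """Return (time, child_pid, signal, command) for each matching log line."""
    events = []
    for line in lines:
        match = CHILD_KILLED.match(line.strip())
        if match:
            events.append(
                (
                    match["time"],
                    int(match["child_pid"]),
                    int(match["signal"]),
                    match["command"],
                )
            )
    return events


# Two entries taken from the thread, each joined onto one line.
sample = [
    '20201005T151354.111797Z pqact[11305] pqact.c:main:416 '
    'NOTE Starting Up {cmd: "pqact etc/pqact_cod.conf"}',
    "20201019T090611.164977Z ldmd[11302] ldmd.c:reap:131 "
    "NOTE child 11305 terminated by signal 14: pqact etc/pqact_cod.conf",
]

events = find_killed_children(sample)
for time, pid, signum, command in events:
    print(f"{time}: child {pid} ({command}) terminated by signal {signum}")
```

To check a real log, the same function could be given an open file, for example `find_killed_children(open("var/logs/ldmd.log"))`, with the path adjusted to the local installation. Running it periodically and alerting on any match is one possible way to notice such a termination sooner.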