This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hiya,

Anne asked if I would look at halo in her absence. Here's what I found last week:

- CCNY is running an older version of the LDM, from before the new queue structure.
- The machine hangs because it is trying to receive too much data with the current configuration.
- Actual abnormal LDM exits occurred when:
  - pqexpire is running and pbuf_flush messages are being emitted (there must be some contention when pqexpire is trying to delete products and pqact is writing products; could it possibly be the same product?)
  - pbuf_flush messages and RE-CLASS messages are being emitted (contention between pqact and the receiver process)

These are speculations, of course, but if halo could run the new version of the LDM, then the first case would be eliminated and possibly the second one too.

This information was derived from the logs. I thought I had saved the actual logs, but I didn't. The only log information I have is a short excerpt from after a RE-CLASS message where the LDM exits; I'll attach it. There were many pbuf_flush messages and RE-CLASS messages in the logs. It appears that both a slow disk problem and a narrow pipe problem exist; maybe the configurations need to be changed.

Robb...

===============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
address@hidden                             WWW: http://www.unidata.ucar.edu/
===============================================================================
Apr 08 13:14:58 halo.sci.ccny.cuny.edu redwood[20174]: FEEDME(redwood.atmos.albany.edu): reclass: 20010408130958.110 TS_ENDT {{IDS|HDS|DDPLUS, ".*"}
Apr 08 13:14:58 halo.sci.ccny.cuny.edu redwood[20174]: assertion "pIf(xdrs->x_op == XDR_ENCODE, *cpp != NULL && **cpp != 0)" failed: file "ldm_xdr.c"
Apr 08 13:15:04 halo.sci.ccny.cuny.edu rpc.ldmd[20168]: child 20174 terminated by signal 6
Apr 08 13:15:04 halo.sci.ccny.cuny.edu rpc.ldmd[20168]: Killing (SIGINT) process group
Apr 08 13:15:04 halo.sci.ccny.cuny.edu rpc.ldmd[20168]: Interrupt
Apr 08 13:15:04 halo.sci.ccny.cuny.edu rpc.ldmd[20168]: Exiting
Apr 08 13:15:05 halo.sci.ccny.cuny.edu 169.226.4.37[20176]: Interrupt
Apr 08 13:15:05 halo.sci.ccny.cuny.edu pqact[20171]: Interrupt
Apr 08 13:15:05 halo.sci.ccny.cuny.edu pqact[20171]: Exiting
Apr 08 13:15:05 halo.sci.ccny.cuny.edu pqbinstats[20170]: Interrupt
Apr 08 13:15:05 halo.sci.ccny.cuny.edu DCSYNOP[14581]: Interrupt Signal
Apr 08 13:15:05 halo.sci.ccny.cuny.edu 169.226.4.37[20176]: Exiting
Apr 08 13:15:05 halo.sci.ccny.cuny.edu pqexpire[20169]: Interrupt
Apr 08 13:15:05 halo.sci.ccny.cuny.edu DCUAIR[14580]: Interrupt Signal
Apr 08 13:15:05 halo.sci.ccny.cuny.edu 169.226.4.58[20179]: Interrupt
Apr 08 13:15:05 halo.sci.ccny.cuny.edu DCSYNOP[14719]: Interrupt Signal
Apr 08 13:15:05 halo.sci.ccny.cuny.edu 169.226.4.58[20179]: Exiting
Apr 08 13:15:05 halo.sci.ccny.cuny.edu DCHRLY[14377]: Interrupt Signal
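
For anyone who wants to repeat this kind of log review, here is a minimal Python sketch that tallies the message types mentioned above and picks out child processes that died on a signal. It is not part of the LDM. The line layout is taken from the excerpt above; the keyword strings (in particular `pbuf_flush`, which does not appear in the excerpt) and the helper names `parse_line` and `summarize` are assumptions made for illustration.

```python
import re
from collections import Counter

# Layout assumed from the excerpt: "Mon DD HH:MM:SS host process[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"(?P<msg>.*)$"
)
SIGNAL_RE = re.compile(r"child (\d+) terminated by signal (\d+)")

# Substrings to count (assumed spellings; adjust to match the real logs).
KEYWORDS = ("pbuf_flush", "reclass", "assertion")


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Return (keyword counts, list of (child pid, signal number))."""
    counts = Counter()
    terminations = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip anything that is not in the expected layout
        msg = record["msg"]
        lowered = msg.lower()
        for keyword in KEYWORDS:
            if keyword in lowered:
                counts[keyword] += 1
        hit = SIGNAL_RE.search(msg)
        if hit:
            terminations.append((int(hit.group(1)), int(hit.group(2))))
    return counts, terminations


# Three lines copied from the excerpt above.
sample = [
    'Apr 08 13:14:58 halo.sci.ccny.cuny.edu redwood[20174]: '
    'FEEDME(redwood.atmos.albany.edu): reclass: 20010408130958.110 '
    'TS_ENDT {{IDS|HDS|DDPLUS, ".*"}',
    'Apr 08 13:14:58 halo.sci.ccny.cuny.edu redwood[20174]: assertion '
    '"pIf(xdrs->x_op == XDR_ENCODE, *cpp != NULL && **cpp != 0)" failed: '
    'file "ldm_xdr.c"',
    'Apr 08 13:15:04 halo.sci.ccny.cuny.edu rpc.ldmd[20168]: '
    'child 20174 terminated by signal 6',
]
counts, terminations = summarize(sample)
print(dict(counts), terminations)
```

On the sample lines this reports one reclass message, one failed assertion, and child 20174 terminated by signal 6. Signal 6 is SIGABRT on most Unix systems, which is consistent with the failed assertion in ldm_xdr.c bringing down the receiver process.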