This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Dave,

> Yeah, I might have been the one who originally encountered this
> problem and asked you'all for help with it.

You are (now that you reminded me -- it's truly been a while).

> Apple's lack of response
> has been disappointing.

To say the least.

> Maybe they'll be more responsive during this
> period immediately following Snow Leopard's release, while they try to
> get the bugs out of it. (Either that, or they'll be more overwhelmed
> than usual with bug reports.)

From your fingers to their eyes. :-)

I suspect that there just aren't that many programs running on Mac OS X
10.4 (or higher) systems that use fcntl() file-locking and mmap()
memory-mapping as much as the LDM.

> Unidata's "known bug" entry for this problem notes that there is no
> workaround. That's true in a sense, but if it were strictly true then
> I'd never be able to stop the LDM at all, even when processes hang
> (which eventually some of them do on a semi-regular basis).

I think a hung downstream LDM will, nevertheless, terminate upon
reception of a SIGTERM, which is what the top-level LDM server sends all
child processes when it's told to terminate. I could be wrong, however.

One thing I have noticed is that attaching to the hung process with
gdb(1) and then exiting gdb(1) will free the process from its hung
state. I'm at a loss to understand how that happens without
intervention by the operating system.

> I've
> written scripts that try to deal with the inability to run "ldmadmin
> stop" to stop the LDM; maybe you could comment on whether or not I've
> got the bases covered acceptably:
>
> (1) Run "ldmadmin stop", redirecting the output to a file.
>
> (2) Check that file for the word "isn't" (as in "the LDM isn't
> running", or something like that).
>
> (3a) If the LDM isn't running, check to see if there's a ldmd.pid
> file.
> -- If there's no ldmd.pid file, run pqcat & pqcheck, then run
> "ldmadmin clean".
>
> (3b) If the LDM is running, wait 30 seconds to give it a chance to
> shut down. (This is typically doomed, at least for some rpc processes.)

Maybe you should give it a minute.

> (4) Get a list of pids for rpc processes owned by the ldm account.
>
> (5) If this list isn't empty:
> -- Run "kill -9" on them all.
> -- Run "ldmadmin clean".
> -- Run pqcat and pqcheck. If this doesn't produce a "tallied
> consistent" message, run "ldmadmin delqueue" and "ldmadmin mkqueue".
>
> (6) Run "ldmadmin start".

This procedure should result in a restarted LDM. I just wish it wasn't
necessary.

> (Script attached.)
>
> -- Dave

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: USJ-914724
Department: Support LDM
Priority: Normal
Status: Closed
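
For readers who want to see the six-step restart procedure quoted above in concrete form, here is a minimal Python sketch. It is not Dave's attached script, which is not reproduced in this archive, and it rests on several assumptions: the pid file path "ldmd.pid" is a placeholder, the hung processes are found by looking for "rpc" in the command name reported by ps, pqcat and pqcheck are run without any options, and the strings "isn't" and "tallied consistent" are matched exactly as they are quoted in the message. The 60-second wait follows Steve's suggestion rather than the original 30 seconds. The helper names (run_command, list_rpc_pids, queue_looks_consistent, restart_ldm) are hypothetical.

```python
import os
import subprocess
import time


def run_command(args):
    """Run a command and return its combined stdout and stderr as text."""
    result = subprocess.run(args, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    return result.stdout


def list_rpc_pids(run, user="ldm"):
    """Return PIDs of the user's processes whose command name contains 'rpc'.

    Assumption: the hung processes can be recognized this way.
    """
    pids = []
    # Two separate -o options with empty headers suppress the header line.
    for line in run(["ps", "-U", user, "-o", "pid=", "-o", "comm="]).splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0].isdigit() and "rpc" in fields[1]:
            pids.append(int(fields[0]))
    return pids


def queue_looks_consistent(run):
    """Run pqcat and pqcheck and look for the 'tallied consistent' message.

    Assumption: both tools are run with no options; a real script would
    probably pass options and discard the product data pqcat writes.
    """
    output = run(["pqcat"]) + run(["pqcheck"])
    return "tallied consistent" in output


def restart_ldm(run=run_command, sleep=time.sleep,
                pid_file_exists=lambda: os.path.exists("ldmd.pid"),  # placeholder path
                wait_seconds=60):
    """Follow steps (1) through (6) of the procedure described in the message."""
    stop_output = run(["ldmadmin", "stop"])            # step 1
    if "isn't" in stop_output:                         # step 2
        if not pid_file_exists():                      # step 3a
            queue_looks_consistent(run)
            run(["ldmadmin", "clean"])
    else:
        sleep(wait_seconds)                            # step 3b

    pids = list_rpc_pids(run)                          # step 4
    if pids:                                           # step 5
        run(["kill", "-9"] + [str(pid) for pid in pids])
        run(["ldmadmin", "clean"])
        if not queue_looks_consistent(run):
            run(["ldmadmin", "delqueue"])
            run(["ldmadmin", "mkqueue"])

    run(["ldmadmin", "start"])                         # step 6
```

Calling restart_ldm() with no arguments as the LDM user would execute the real commands. The command runner, the sleep function, and the pid-file check are parameters so that the control flow can be exercised with stand-ins before it is trusted on a live system.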