This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Tom,

> >The ldmd.log on shemp does show more of these sorts of messages than
> >usual:
> >
> > Oct 29 13:18:23 shemp.unidata.ucar.edu motherlode[7962]: pq_del_oldest: conflict on 1785472176
> > Oct 29 13:18:23 shemp.unidata.ucar.edu motherlode[7962]: comings: pqe_new: Resource temporarily unavailable
> > Oct 29 13:18:23 shemp.unidata.ucar.edu motherlode[7962]: : 68c5b47e4b3a9c49959e63a570bb6127 21422 20001029131013.897 NMC2 174 /u/ftp/ga ...
> >
> >I don't recall seeing these at all during our earlier testing. I
> >think what this means is that a receiver process can't get a lock on
> >the oldest product in the queue to delete it to make space for an
> >incoming product, because some other process, probably a sender
> >process, still has a lock on that region. If a sender died while it
> >still had a lock on a product, it would never release it, so this
> >might be a symptom of that. But later messages refer to a different
> >region, so the lock must have gotten released. This may be a red
> >herring, but I'll try to look at it more carefully to see exactly why
> >it is occurring.
>
> Going with the concept of a slow feedee, I can offer the following:
> I saw the machine navier from Penn State connecting to shemp. I also
> recall seeing a message that navier was on a network that was having
> problems. Perhaps the two go together? If so, the next question is
> why is shemp feeding navier? The question after that is can I shut it
> off and see if shemp returns to the land of the living?

Thinking about it some more, I'll bet what has the lock on these
oldest products is not a downstream sender but a McIDAS decoder. That
would be consistent with the other symptoms, but it would mean that
the above message doesn't help much, it's just another indication that
it takes a long time for the McIDAS decoders to get the products to
decode.
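The kind of conflict described above can be reproduced in miniature. The Python sketch below is not LDM code. It assumes the queue regions are protected by POSIX advisory record locks, which the "Resource temporarily unavailable" (EAGAIN) text suggests but the messages do not confirm. The temporary file, the offset and length, and the names try_lock_region and demo_conflict are all hypothetical. A child process stands in for a slow reader holding a region, and the parent stands in for the receiver trying to lock the same region without blocking.

```python
import errno
import fcntl
import os
import tempfile


def try_lock_region(fd, offset, length):
    """Attempt a non-blocking exclusive lock on bytes [offset, offset+length).

    Returns None on success, or the errno (EAGAIN or EACCES) when another
    process already holds a conflicting lock on the region.
    """
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, offset, os.SEEK_SET)
        return None
    except OSError as exc:
        if exc.errno in (errno.EAGAIN, errno.EACCES):
            return exc.errno
        raise


def demo_conflict(offset=4096, length=512):
    """Return (result while the child holds the lock, result after it exits)."""
    with tempfile.TemporaryFile() as f:
        f.truncate(offset + length)
        fd = f.fileno()
        ready_r, ready_w = os.pipe()   # child -> parent: "lock is held"
        done_r, done_w = os.pipe()     # parent -> child: "you may exit"
        pid = os.fork()
        if pid == 0:
            # Child: hypothetical slow reader that holds the region lock.
            try:
                os.close(ready_r)
                os.close(done_w)
                fcntl.lockf(fd, fcntl.LOCK_EX, length, offset, os.SEEK_SET)
                os.write(ready_w, b"1")
                os.read(done_r, 1)     # keep the lock until told to stop
            finally:
                os._exit(0)            # exiting releases the child's locks
        # Parent: hypothetical receiver that wants the same region.
        os.close(ready_w)
        os.close(done_r)
        os.read(ready_r, 1)
        while_held = try_lock_region(fd, offset, length)
        os.write(done_w, b"1")
        os.waitpid(pid, 0)
        after_release = try_lock_region(fd, offset, length)
        os.close(ready_r)
        os.close(done_w)
        return while_held, after_release
```

Calling demo_conflict() returns a pair: an errno while the child still holds the region, and None once the child has exited and its lock is gone. That second value mirrors the observation above that a lock which later moves to a different region must have been released.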
Right now the only indication of the region with a lock on it in
pq_del_oldest messages like

  Oct 29 13:18:23 shemp.unidata.ucar.edu motherlode[7962]: pq_del_oldest: conflict on 1785472176

is the region offset, 1785472176, which is not very useful to mere
humans. I think I can improve this message to provide the product ID,
so we would at least know which product was locked. If these all
turned out to be McIDAS products, it might help in diagnosing the
problem.

To use the new debugging code, we would have to stop and restart the
LDM on shemp. I'll see if this is practical by just changing the code
in pq_del_oldest on shemp for now ...

--Russ
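Until the message carries a product ID, the offsets can at least be tallied to see whether the same region keeps turning up or whether the conflicts move around. The following Python sketch assumes log lines in exactly the format quoted above. The names CONFLICT_RE and tally_conflicts are hypothetical helpers and are not part of the LDM.

```python
import re
from collections import Counter

# Matches lines such as:
# Oct 29 13:18:23 shemp.unidata.ucar.edu motherlode[7962]: pq_del_oldest: conflict on 1785472176
CONFLICT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]:\s+"
    r"pq_del_oldest: conflict on (?P<offset>\d+)\s*$"
)


def tally_conflicts(lines):
    """Count pq_del_oldest conflicts per (process name, region offset)."""
    counts = Counter()
    for line in lines:
        match = CONFLICT_RE.match(line.strip())
        if match:
            key = (match.group("proc"), int(match.group("offset")))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed invocation): python tally.py < ldmd.log
    for (proc, offset), n in tally_conflicts(sys.stdin).most_common():
        print(f"{n:6d}  {proc}  offset {offset}")
```

An offset that dominates the tally over a long period would point toward a lock that is never released. Many different offsets with small counts would fit the slow-reader explanation better.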