This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Luis Cano wrote:
>
> Hello Anne,
>
> Thanks for the reply.
>
> The servers are new machines that will be brought into operations soon. A couple of weeks ago, we started with an older version of the ldm and had similar problems. Since then, the kernel on one of the servers was upgraded to 2.2.16, and the ldm was upgraded to 5.1.2 on both servers. I did not install the LDM, but I am fairly certain that it was a binary install.
>
> Also, this problem does not happen right away. The LDM will only run for a couple of days, less than a week. To recover, we have to delete the queue and recreate it. So we have deleted the queue a couple of times since upgrading the ldm.
>
> I poked around in the Unidata mail archives and saw that this type of problem was reported previously. I did not save the email, but the email thread basically considered this problem a kernel issue -- which could very well be the problem. In the email thread, the kernel was upgraded to 2.2.16 and the problem was reported fixed. However, since the problem does not happen immediately and the exact nature of the problem is not known, I would think that the ldm would need to run for a number of weeks before it could be considered fixed.
>
> I'm in the process of upgrading the drivers, in an attempt to eliminate driver issues that may be impacting the kernel. Also, I am going to recompile the ldm with the -g switch so I can analyze a dump if the ldm cores.
>
> Do you have any other suggestions?
>
> Thanks and appreciate the help.
>
> Lou
>

Hi Lou,

Well, I'm kinda stumped on this one. Now I'm wondering about the RAID disk after all. In an earlier message I said I didn't know of any problems with a RAID disk. We have used a RAID disk successfully under Solaris, although we did not keep the queue there - only the data.

I saw the email in the archives about the kernel upgrade (http://www.unidata.ucar.edu/cgi-bin/mfs/65/3878?96#mfs).
I agree - it's not clear whether the problem was fixed for good or not, but since we did not hear back from them it may have fixed the problem. Thus, it seems to make sense to upgrade the OS and the drivers. What has happened since you upgraded the one to 2.2.16?

From your previous email, I gather that both of your machines have only RAID disks, is that right? Thus, the product queue must be on a RAID disk. Something else to try would be to add a non-RAID disk and put the queue there, if possible.

I wish I could be more helpful. Let me know what you find out. In the meantime I'll make some inquiries and let you know if I get any more ideas.

Anne

--
***************************************************
Anne Wilson
UCAR Unidata Program
address@hidden
P.O. Box 3000
Boulder, CO 80307
----------------------------------------------------
Unidata WWW server http://www.unidata.ucar.edu/
****************************************************