This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Pete,

re:
> We just got a new machine to replace idd.aos.wisc.edu. The machine has 2
> 8-core opteron processors, 32 Gb of RAM, and two 300 Gb SAS disks. I
> have it running Scientific Linux (redhat recompile) version 6.

Do you have a reading of which version of Fedora your version of
Scientific Linux corresponds to? (This question might make more sense as
you read further.)

re:
> The ldm queue is on a software raid 0 spanning both disks.
>
> The ldm built fine, and starts fine, but after ingesting data for some
> time (usually 45 min to an hour) one ldmd process will peg to 100%, and
> data stops being ingested and relayed. I can use ldmadmin stop to stop
> the ldm, and it seems to stop and restart properly, but I don't see
> anything in the logs that indicate what went wrong.
>
> Any ideas?

Quite some time ago, I experimented with hardware and software RAIDs on
Fedora Core 1 and 3 systems that I was putting together in my office. I
found that LDM performance with the queue on the RAID was very poor...
so poor, in fact, that the latencies for all feeds would slowly climb
and eventually sit at 3600 seconds. This behavior occurred in more or
less the same way regardless of the RAID configuration (i.e., regardless
of whether it was a hardware or software RAID). I found that LDM
performance would improve dramatically (understatement) if the LDM queue
was moved to a non-RAID file system.

The question that comes to my mind is: what would happen if you moved
your LDM queue to a non-RAID file system? Here are the tests I would try:

- first, try reducing the size of the queue to something like 2 GB.
  Does the severe reduction in size eliminate the problem?

- if the problem persists, try putting the queue on a non-RAID file
  system. Does the problem disappear?

I must add, however, that we are running our idd.unidata.ucar.edu
cluster nodes with large (e.g., 12 or 20 GB) LDM queues on RAID-located
file systems.
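The two tests above could be run roughly as follows. This is a sketch,
assuming a stock LDM 6.x installation whose registry is managed with
regutil(1); the non-RAID path /nonraid/ldm.pq is a placeholder, not a
recommendation for any particular mount point:

```shell
# Test 1: shrink the product queue to ~2 GB and rebuild it.
ldmadmin stop
regutil -s 2G /queue/size      # set the queue size in the LDM registry
ldmadmin delqueue              # delete the existing product queue
ldmadmin mkqueue               # recreate it at the new size
ldmadmin start

# If an ldmd process still pegs the CPU after 45-60 minutes:

# Test 2: move the queue to a non-RAID file system.
ldmadmin stop
regutil -s /nonraid/ldm.pq /queue/path   # placeholder non-RAID location
ldmadmin delqueue
ldmadmin mkqueue
ldmadmin start
```

While each test runs, `pqmon` will report queue utilization and
`ldmadmin watch` will show products as they arrive, which should make
it clear whether ingest has stalled again.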
The difference might be that we are running reasonably current versions
of Linux (Fedora 12) on those nodes (hence the question about which
Fedora release your Scientific Linux distribution corresponds to).

re:
> I can get you an account on the machine if you want to poke
> around on it.

OK, thanks. Just so you know, the problem could potentially come down to
redoing the file system(s), and that will not be doable remotely.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                            Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                          http://www.unidata.ucar.edu
****************************************************************************

Ticket Details
===================
Ticket ID: TIB-337303
Department: Support LDM
Priority: Normal
Status: Closed