This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
"Arthur A. Person" wrote: > > > > > When you say "it still thrashes", do you mean that products aren't being > > received in a timely manner? Right now products on ldm.meteo appear to > > be arriving pretty quickly. And, 'top' is showing a low load average, > > the machine appears to be responsive, and there's a reasonable number of > > rpc.ldmds... Is this all with your 600Mb queue? > > By thrashing, I mean that the disk I/O light is mostly on and occasionally > blinks off and the system has very slow response and the IDD reception is > lagging at the reclass time limit but a "top" shows only a few percent of > cpu usage. The IDD seems fine on ldm right now because I restarted it > last night and also remade the queue to 600MB. This doesn't tell us > anything about the cause, but I'm beginning to suspect that it has > something to do with using a large queue. I'm going to run it with the > queue at 600MB until I leave for vacation next Friday... if it makes it > that long without a problem, I'll conclude it's queue size related and we > can resume working on this when we both get back from vacation. > > I still have my wsi data coming in, so if I don't see problems in the next > week, I'll probably assume the wsi rpc's are a symptom rather than a > cause, although they should still shut down when a connection is lost. > Art, FYI, Charlie O'Brian at WSI agreed to feed our 7.1 machine temporarily starting Monday. I'll request the WSI data then, and try it with various queue sizes. Also, he said: > Unless there is a problem (ie internet congestion, system crash, > client LDM stopping, etc), out program should never have to reconnect. > Our processes check every 5 minutes to make sure the client is > connected. I noticed that we did a lot of restarting thru 5z this > morning. I would hazzard to guess they are fine, now. Yesterday, from the piece of the log I ftp'ed from your site, there were 155 connections in about 12 hours. (And only 106 disconnects, as I recall.) 
Could connectivity be a factor? And yet, I'm assuming you had no similar
problems when you were using navier, is that right?

You could try going back to the 2GB queue and see if the problem
returns...

Anne
--
***************************************************
Anne Wilson                     UCAR Unidata Program
address@hidden                  P.O. Box 3000
                                Boulder, CO 80307
----------------------------------------------------
Unidata WWW server       http://www.unidata.ucar.edu/
****************************************************
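
The connection and disconnect counts quoted in the message can be tallied from a log with a few lines of Python. The following is only a rough sketch: the function name `summarize_connections` and the two marker strings are hypothetical, since the message does not show the actual wording of the log lines, so the caller has to supply whatever text really identifies a connect or a disconnect in their own log.

```python
from typing import Iterable, Optional


def summarize_connections(
    lines: Iterable[str],
    connect_marker: str,      # assumption: text that identifies a connect line
    disconnect_marker: str,   # assumption: text that identifies a disconnect line
    window_hours: float,
) -> dict:
    """Count connect/disconnect lines and estimate the mean reconnect spacing."""
    connects = 0
    disconnects = 0
    for line in lines:
        # Test the disconnect marker first, because a marker such as
        # "connect" would otherwise also match inside "disconnect".
        if disconnect_marker in line:
            disconnects += 1
        elif connect_marker in line:
            connects += 1

    # Average number of minutes between connections over the window.
    mean_minutes: Optional[float] = (
        window_hours * 60.0 / connects if connects else None
    )
    return {
        "connects": connects,
        "disconnects": disconnects,
        "unmatched": connects - disconnects,
        "mean_minutes_between_connects": mean_minutes,
    }
```

With the figures from the message (155 connections and 106 disconnects in about 12 hours), this gives 49 connections with no matching disconnect and an average of roughly 4.6 minutes between connections (720 / 155). That is close to the 5-minute check interval WSI describes, which may or may not be a coincidence.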