This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
On Fri, 20 Apr 2001, anne wrote:

> Tom and Robb,
>
> Halo looks about like it did before - RECLASS and skipped messages are
> appearing in the log pretty regularly. I presume that is due to network
> congestion - some products are arriving 1 to 2 minutes late. Although,
> this is consistent with this other phenomenon I'm seeing that I described
> to you last night - most of the products that are arriving there have
> time stamps that are almost exactly an hour old. I assume the LDM uses

Anne,

The RE-CLASS messages I believe only occur if a product is being sent or
received that is over an hour old.

When I looked at the logs last week, my main concern was the number of
pbuf_flush messages and that the time elapsed was high on the messages. So
the machine is spending a great deal of time waiting to write data. This
makes me think that the first problem to tackle is the disk I/O. I would
check that the queue and where the data products are stored are both local
disks. If that's the case, then check on the number of disks on halo. A
possible solution would be to add another disk and have the queue on one
and the data storage on another. Mike knows more about fine tuning systems.
Another solution of course would be to request less data, nobody wants to
do that.

After the pbuf_flush problem is solved then I would look at the amount of
data requested. My opinion is to solve the machine problem before checking
the network.

Robb...

> strictly Z time, although the symptoms make me wonder about the spring
> time change. Do you guys have any ideas about this?
>
> Also, halo is just sluggish even with load averages less than 1.0 and the
> CPU idle a significant percentage of the time. The only thing I can
> point to that might contribute to this is that, according to 'top', there
> is only 128Mb of RAM. Perhaps the sluggishness is why there are still
> significant numbers of pbuf_flush messages in the log.
>
> Maybe this is just the way it is for halo. Or, maybe Ward should add
> another 128M of RAM. What do you think?
>
> Anne

===============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
address@hidden                             WWW: http://www.unidata.ucar.edu/
===============================================================================
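
The two checks suggested in the reply (is a product more than an hour old, and do the queue and the data directory share a disk) can be sketched roughly as below. This is a minimal illustration, not part of the LDM: the function names are hypothetical, the one-hour cutoff is taken from the reply's description rather than from LDM source or configuration, and comparing device IDs only shows whether two paths are on the same filesystem. It cannot tell a local disk from a network mount, and two partitions of one physical disk will still report different IDs.

```python
import os
from datetime import datetime, timedelta, timezone

# Assumption: cutoff taken from the reply above ("over an hour old"),
# not read from any LDM setting.
MAX_PRODUCT_AGE = timedelta(hours=1)


def product_age(created_utc, now_utc=None):
    """Return how old a product is, comparing two timezone-aware UTC times."""
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    return now_utc - created_utc


def is_over_an_hour_old(created_utc, now_utc=None):
    """True if the product's timestamp is more than one hour in the past."""
    return product_age(created_utc, now_utc) > MAX_PRODUCT_AGE


def on_same_filesystem(path_a, path_b):
    """True if both paths report the same device ID (same mounted filesystem)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Illustrative timestamps only; these are not values from halo's logs.
    now = datetime(2001, 4, 20, 18, 0, tzinfo=timezone.utc)
    late = now - timedelta(minutes=2)
    stale = now - timedelta(minutes=61)
    print("2 minutes late, over an hour old?", is_over_an_hour_old(late, now))
    print("61 minutes late, over an hour old?", is_over_an_hour_old(stale, now))
```

To apply the second check, call `on_same_filesystem` with the path of the product queue and the path of the data directory on the machine in question. If it returns True, the queue and the data storage are competing for the same filesystem, which is the situation the reply suggests avoiding.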