This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
>To: address@hidden
>From: Gregory Grosshans <address@hidden>
>Subject: Characterization of data load on LDM and predicting impacts on LDM queue load
>Organization: NOAA/SPC
>Keywords: 200111022252.fA2MqI102912 LDM performance

Gregg,

> Thanks for the response. We are using LDM 5.1.2. In regards to
> pbuf_flush log messages, when should one become concerned about
> them, if at all? You also mentioned Unidata will be watching
> motherlode to see how it handles the increased radar data and if it
> can't handle it you may farm out some of the data and processing to
> another machine. How will you determine that motherlode can't
> handle the increased data (e.g. corrupted gempak decoded files like
> metar, a larger number of pbuf_flush messages, the load on the
> system climbing high)?

It may be that you are getting pbuf_flush messages as an artifact of
trying to process very large products through pqact. The code in
pqact that writes the messages just tests whether it takes longer than
1 second to write a product, either to a file or to a pipe, and if so,
it emits the message. I think this code was written back when no
products were bigger than about 20 Kbytes, so maybe the arbitrary
1 second threshold needs to be larger. I think the intent was just to
indicate when pqact was falling behind, either because of slow writes
or slow decoders.

It looks like almost all your pbuf_flush messages would go away if
that threshold were set to 10 seconds instead of 1 second. The
relevant code is in pqact/pbuf.c, line 190:

    #ifdef INSTRUMENT
        gettimeofday(&afta, 0);
        diff = diff_timeval(&afta, &b4);
        if(diff.tv_sec > 1)
        {
            uerror("pbuf_flush %d: time elapsed %3ld.%06ld",
                buf->pfd, diff.tv_sec, diff.tv_usec);
        }
    #endif

It looks like another way to eliminate these messages from your log
file would be to undefine the "INSTRUMENT" macro and recompile, but I
haven't tested that.
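As an illustration of the threshold change discussed above, here is a minimal standalone sketch of the same timing check with the threshold pulled out into a parameter. It is not the LDM source: FLUSH_WARN_SECS, elapsed_between(), flush_was_slow(), and report_if_slow() are hypothetical names, elapsed_between() is only a stand-in for what diff_timeval presumably does, and fprintf to stderr stands in for uerror. Note that the quoted test, diff.tv_sec > 1, compares whole seconds and so fires only once two or more full seconds have elapsed; the sketch uses ">= threshold" so that the threshold means what it says.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical tunable (not in the LDM source): warn when a flush takes
 * at least this many seconds. */
#ifndef FLUSH_WARN_SECS
#define FLUSH_WARN_SECS 10
#endif

/* Return later - earlier, normalized so that 0 <= tv_usec < 1000000.
 * Assumed stand-in for diff_timeval() in the quoted code. */
static struct timeval
elapsed_between(const struct timeval *later, const struct timeval *earlier)
{
    struct timeval d;

    d.tv_sec = later->tv_sec - earlier->tv_sec;
    d.tv_usec = later->tv_usec - earlier->tv_usec;
    if (d.tv_usec < 0) {
        /* Borrow one second to keep the microsecond field non-negative. */
        d.tv_sec -= 1;
        d.tv_usec += 1000000;
    }
    return d;
}

/* Nonzero if the elapsed time is at least threshold_secs whole seconds. */
static int
flush_was_slow(const struct timeval *elapsed, long threshold_secs)
{
    assert(threshold_secs >= 0);
    return elapsed->tv_sec >= threshold_secs;
}

/* Log a pbuf_flush-style message if the write between b4 and afta was slow.
 * Returns 1 if a message was written, 0 otherwise. */
static int
report_if_slow(int fd, const struct timeval *b4, const struct timeval *afta,
               long threshold_secs)
{
    struct timeval d = elapsed_between(afta, b4);

    if (!flush_was_slow(&d, threshold_secs))
        return 0;
    fprintf(stderr, "pbuf_flush %d: time elapsed %3ld.%06ld\n",
            fd, (long)d.tv_sec, (long)d.tv_usec);
    return 1;
}
```

In use, one would call gettimeofday(&b4, 0) before the write, gettimeofday(&afta, 0) after it, and then report_if_slow(fd, &b4, &afta, FLUSH_WARN_SECS). Building with -DFLUSH_WARN_SECS=1 would give roughly the original sensitivity.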
But I wouldn't worry too much about these messages; the products are
getting filed/decoded, but it's taking a while. You might look for
ways to put less load on pqact or the processes it calls by filing
products in larger batches, optimizing the decoders you are using, or
doing some of the processing on a different machine.

We would probably determine that motherlode couldn't handle the load
by seeing product latencies climb uniformly to all downstream sites,
or by noticing that pqact couldn't keep up with handling all the
products in the queue in a timely fashion. Determining when a product
is first inserted in the queue and how much later pqact finishes with
it can be done with verbose logging.

--Russ