[Support #SBB-325304]: Re: 20110712: CONDUIT request -- fire weather grids
- Date: Thu, 25 Aug 2011 15:48:20 -0600
Hi Justin and Becky,
re: latest metrics.txt file(s)
> The latest version is available at:
> http://www.ftp.ncep.noaa.gov/data1/nccf/com/tmp/metrics.txt
Thanks for putting the metrics.txt file out for me to grab. The information it
contains is _very_ useful.
Now the good news:
- plots of the information in the metrics.txt file show that things are working
nicely on ncep-ldm2.woc.noaa.gov:
- load averages are nice and low, especially after you found and killed
the stuck 'gribinsert' process
- the smallest age of the oldest product in the LDM queue did not become
smaller after the new parallel NAM/fire weather products were added
to the test CONDUIT mix
This is very important since it shows that there was no serious impact
created when the additional data was added.
NB: this did not create any problems because the parallel NAM data
was (and continues to be) added at off peak times for the "operational"
CONDUIT feed. If the timing of the addition of the parallel NAM data
changes, problems could arise!
- the plot of the number of bytes in the LDM queue vs time shows that
the queue is volume limited. The change from being slots limited
to being volume limited coincides with the mounting of all needed
file systems so that 'gribinsert' had access to model output files
that needed to be carved-up into individual products which would then
be added to the LDM queue.
Being volume limited is _not_ a problem for the queue. It shows,
however, that adding more volume at the "wrong" times (the existing
peak times) could affect overall performance by further lowering the
smallest age of the oldest product in the queue.
- the trace of the age of the oldest product in the queue shows that
there has been little/no effect on the smallest age, but there has
been a decrease in the largest age (from 3.5 hours to ~2.7 hours).
This is simply another way of saying that there is more data flowing
through the queue now.
- the trace of the number of products in the LDM queue vs time shows
noticeable growth
This reflects all file systems now being correctly mounted and the
model output on those file systems being processed into the feed
(a good thing).
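The link between the age of the oldest product and the amount of data flowing through the queue can be illustrated with a rough back-of-the-envelope sketch. It assumes (a simplification on my part, not something read from metrics.txt) that a volume-limited queue of fixed size holds about size/rate hours of data, so the age of the oldest product scales inversely with the rate at which data arrives. The function name is hypothetical; the 3.5 and 2.7 hour figures are the largest ages quoted above.

```python
def implied_rate_ratio(old_age_hours: float, new_age_hours: float) -> float:
    """Return new_rate / old_rate implied by a change in oldest-product age.

    Assumption (not from the source): queue size is fixed and volume
    limited, so age ~ size / rate and the size cancels out of the ratio.
    """
    if old_age_hours <= 0 or new_age_hours <= 0:
        raise ValueError("ages must be positive")
    return old_age_hours / new_age_hours


# Largest age of the oldest product before and after the new data was added.
ratio = implied_rate_ratio(3.5, 2.7)
print(f"implied data rate ratio during the quietest periods: {ratio:.2f}")
```

Under that assumption the drop in the largest age corresponds to roughly 30% more data arriving during the quietest periods; it says nothing about the peak periods, which is where the smallest age is set.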
The volume plots from the real-time statistics for ncep-ldm2.woc.noaa.gov,
when compared against the volume plots for one of the "operational"
CONDUIT top level machines, ncep-ldm0.woc.noaa.gov, graphically show
the large increase in data in the test CONDUIT datastream:
ncep-ldm0.woc.noaa.gov
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?CONDUIT+ncep-ldm0.woc.noaa.gov
ncep-ldm2.woc.noaa.gov
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?CONDUIT+ncep-ldm2.woc.noaa.gov
Put each plot up in a browser tab and flip back and forth between the plots, and
you will instantly see the large increase in volume and new bi-modality in the
test CONDUIT feed. A quantitative listing of this difference can easily be
obtained from the cumulative volume listings for each of the machines above:
ncep-ldm0.woc.noaa.gov
http://www.unidata.ucar.edu/cgi-bin/rtstats/rtstats_summary_volume?ncep-ldm0.woc.noaa.gov
Data Volume Summary for ncep-ldm0.woc.noaa.gov
Maximum hourly volume        6612.373 M bytes/hour
Average hourly volume        2429.549 M bytes/hour
Average products per hour       64028 prods/hour
Feed       Average                    Maximum          Products
           (M byte/hour)              (M byte/hour)    number/hour
CONDUIT    2429.549    [100.000%]     6612.373         64028.304
ncep-ldm2.woc.noaa.gov
http://www.unidata.ucar.edu/cgi-bin/rtstats/rtstats_summary_volume?ncep-ldm2.woc.noaa.gov
Data Volume Summary for ncep-ldm2.woc.noaa.gov
Maximum hourly volume        6612.534 M bytes/hour
Average hourly volume        3258.420 M bytes/hour
Average products per hour       70480 prods/hour
Feed       Average                    Maximum          Products
           (M byte/hour)              (M byte/hour)    number/hour
CONDUIT    3258.420    [100.000%]     6612.534         70479.957
What I am most interested in is a comparison of the average volume from each
feed:
ncep-ldm0.woc.noaa.gov: 2429.549 M bytes/hour
ncep-ldm2.woc.noaa.gov: 3258.420 M bytes/hour
The increase in average volume (measured over a two day period) is on the order
of 800 MB/hour. The maximum hourly increase in volume is up to 3.2 GB!
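For reference, the arithmetic behind the average figure can be reproduced with a short sketch. The numbers are copied from the two rtstats volume summaries quoted above; the variable names are just illustrative.

```python
# Average hourly figures copied from the rtstats summaries quoted above.
stats = {
    "ncep-ldm0.woc.noaa.gov": {"avg_mb_per_hour": 2429.549, "prods_per_hour": 64028.304},
    "ncep-ldm2.woc.noaa.gov": {"avg_mb_per_hour": 3258.420, "prods_per_hour": 70479.957},
}

operational = stats["ncep-ldm0.woc.noaa.gov"]  # "operational" CONDUIT feed
test_feed = stats["ncep-ldm2.woc.noaa.gov"]    # test CONDUIT feed

# Absolute and relative growth in average hourly volume.
volume_increase = test_feed["avg_mb_per_hour"] - operational["avg_mb_per_hour"]
volume_percent = 100.0 * volume_increase / operational["avg_mb_per_hour"]

# Growth in the average number of products per hour.
product_increase = test_feed["prods_per_hour"] - operational["prods_per_hour"]

print(f"average volume increase: {volume_increase:.3f} MB/hour ({volume_percent:.1f}%)")
print(f"average product increase: {product_increase:.0f} products/hour")
```

This works out to about 829 MB/hour, or roughly a third more data on average than the "operational" feed.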
Needless to say, this is _not_ an insignificant increase in data volume, AND it
is something that will need to be explained and OKed by the current CONDUIT
community.
The good news is that there is enough information in the Product IDs for the
fire weather products for the end-users to opt out of getting the data (i.e.,
not REQUEST the data in one's ~ldm/etc/ldmd.conf file) and/or not process the
data (i.e., not process the data in an ~ldm/etc/pqact.conf pattern-action file).
Example Product ID for a fire weather product:
data/nccf/com/nam/para/nam.20110825/nam.t18z.firewxnest.hiresf12.tm00.grib2
!grib2/ncep/NMM_89/#000/201108251800F012/WTMP/0 - NONE! 001001
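To illustrate why the Product ID carries enough information to single out these products, here is a minimal sketch of a pattern test. It assumes that the string "firewxnest" in the Product ID is what marks a fire weather product (inferred from the one example above, not from any specification), and that the two lines of the example form a single Product ID joined by a space. The helper name and the second, made-up Product ID are hypothetical. It is written in Python only to show the idea; it is not an ldmd.conf or pqact.conf entry.

```python
import re

# Hypothetical pattern: assumes "firewxnest" marks fire weather products.
FIRE_WEATHER = re.compile(r"firewxnest")


def is_fire_weather(product_id: str) -> bool:
    """Return True if the Product ID looks like a fire weather product."""
    return FIRE_WEATHER.search(product_id) is not None


# Example Product ID from above, joined into one string.
fire_id = (
    "data/nccf/com/nam/para/nam.20110825/"
    "nam.t18z.firewxnest.hiresf12.tm00.grib2 "
    "!grib2/ncep/NMM_89/#000/201108251800F012/WTMP/0 - NONE! 001001"
)

# Made-up Product ID for contrast; not taken from the real feed.
other_id = "some/other/model/output.grib2 !grib2/made/up/id! 000001"

print(is_fire_weather(fire_id))
print(is_fire_weather(other_id))
```

A site that does not want the data would build its REQUEST or pattern-action entries around whatever distinguishing string it settles on in the same spirit.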
Wrap-up comments:
- the test is proceeding nicely
- the test machine is, from all appearances, not showing any detrimental
effects of the increase in volume
- the additions I made to the GEMPAK tables seem to be sufficient for
creation of descriptive LDM/IDD Product IDs for all fire weather
products.
I am still checking to see if more additions need to be made for
GRIB/GRIB2 messages in NOAAPort.
The question is: what do we do next?
Cheers,
Tom
--
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: SBB-325304
Department: Support CONDUIT
Priority: Normal
Status: Closed