[McIDAS #KDN-271049]: /home/data/mcidas/images isn't scouring, filling disk
- Date: Mon, 27 Aug 2007 15:37:41 -0600
Hi Gilbert,
OK, I am just about done tweaking the setup on weather3...
re: FILEing of NEXRAD Level III products on weather
> Well the CPU load is incorporated into the overall load average, and
> that's what's critical.
I am suspicious that the negative effects you were seeing (e.g., high load
averages) may have been caused by having too many processing actions for
too many feeds in your ~ldm/etc/pqact.gempak file. The reason I say that is
the following observation on weather3:
1) the list of feeds for which there are actions in pqact.gempak is:
CMC|CONDUIT|FNEXRAD|FSL2|GPS|NNEXRAD|NGRID|NIMAGE|NLDN|NOGAPS|PCWS|UNIDATA|WSI
2) the number of actions in pqact.gempak is:
/home/ldm/etc% grep -v ^# pqact.gempak | grep -v ^" " | grep -v ^$ | wc -l
487
3) the Data Volume Summary page for weather3.admin.niu.edu is as follows:
http://www.unidata.ucar.edu/cgi-bin/rtstats/rtstats_summary_volume?weather3.admin.niu.edu
Data Volume Summary for weather3.admin.niu.edu
Maximum hourly volume 623.716 M bytes/hour
Average hourly volume 411.020 M bytes/hour
Average products per hour 49993 prods/hour
Feed Average Maximum Products
(M byte/hour) (M byte/hour) number/hour
HDS 191.371 [ 46.560%] 376.626 18196.435
NEXRAD2 79.296 [ 19.293%] 129.694 4481.261
FNEXRAD 69.901 [ 17.007%] 88.611 70.674
NNEXRAD 24.180 [ 5.883%] 29.013 2054.261
UNIWISC 20.784 [ 5.057%] 32.314 23.826
IDS|DDPLUS 17.690 [ 4.304%] 21.914 25127.891
DIFAX 5.543 [ 1.349%] 22.425 6.957
FSL2 2.026 [ 0.493%] 2.156 21.848
NLDN 0.229 [ 0.056%] 0.577 9.696
This listing shows that there are about 49500 products each hour that are
checked for processing by the pqact that is handling the pqact.gempak
actions.
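As a cross-check on the action count in item 2, here is a rough Python
equivalent of that grep pipeline. It is only a sketch: the function name and
the sample entries are made up for illustration, and treating a leading tab
the same as a leading space is my assumption about how continuation lines are
indented.

```python
def count_actions(lines):
    """Count pattern/action entries the way the grep pipeline does:
    skip comments, continuation lines, and blank lines."""
    count = 0
    for line in lines:
        text = line.rstrip("\n")
        if not text:                 # blank line        (grep -v ^$)
            continue
        if text.startswith("#"):     # comment line      (grep -v ^#)
            continue
        if text[0] in " \t":         # continuation line (grep -v ^" ")
            continue                 # tab handling is an assumption
        count += 1
    return count


# Hypothetical entries, for illustration only (not from weather3)
sample = [
    "# a comment\n",
    "NIMAGE\t^pattern-one\n",
    "\tFILE\tdata/one\n",
    "\n",
    "NNEXRAD\t^pattern-two\n",
    "\tFILE\tdata/two\n",
]
print(count_actions(sample))
```

To run it against a real file, pass an open file object, e.g.
count_actions(open("pqact.gempak")).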
Before I changed the list of feeds that would be processed by the pqact that
is responsible for pqact.gempak actions, it would have to scan ALL products
in ALL feeds each hour, or almost 50000 products on average.
This means that the pqact has to do 49500 * 487 = 24106500 comparisons
each hour, and it acts on only some fraction of them. NOTE that pqact does
not stop working its way through a pattern/action file when a match is
found; it continues looking for additional matches.
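The arithmetic above can be written out as a small Python sketch. It assumes,
as the estimate does, that every product is compared against every entry; the
per-second figure is just the hourly number divided by 3600, and the names
are illustrative only.

```python
PRODUCTS_PER_HOUR = 49500   # rounded figure used in the estimate above
ACTIONS = 487               # entries counted in pqact.gempak


def comparisons_per_hour(products, actions):
    # pqact keeps scanning after a match, so in this estimate
    # every product is checked against every entry
    return products * actions


total = comparisons_per_hour(PRODUCTS_PER_HOUR, ACTIONS)
per_second = total / 3600

print(total)        # comparisons per hour
print(per_second)   # comparisons per second, averaged over the hour
```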
The amount of processing that this single pqact would have to do would lead
me to believe the following:
- it would likely fall behind in its processing if it was tasked with FILEing
all NEXRAD Level III products
- it should consume a lot of CPU
Now, splitting the actions into more pqact.conf files will help keep any one
pqact from falling behind in the processing it is attempting to do. It should
not, however, decrease the overall CPU use; in fact, it should increase it
over a shorter time interval.
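A back-of-the-envelope sketch of that point, in Python. It assumes the
actions are divided evenly and that each pqact still scans every product
(i.e., no feed restriction per pqact); both are simplifications of mine, and
the function name is illustrative only.

```python
def split_load(products, actions, n_pqacts):
    """Return (per_process, total) comparisons per hour when the actions
    are divided evenly among n_pqacts processes that each see all products."""
    total = products * actions        # unchanged by the split
    per_process = total / n_pqacts    # what any single pqact must do
    return per_process, total


for n in (1, 2, 4):
    per_process, total = split_load(49500, 487, n)
    print(n, per_process, total)
```

Under these assumptions the work per pqact drops with each split, while the
total number of comparisons stays the same.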
So what's my point?
- Chiz added the ability to generate multiple pqact.conf files for GEMPAK
processing based on his observation that if one leaves all processing
in one pqact.conf file, then one might see the processing fall behind enough
so that products will not get processed out of the LDM queue before they are
overwritten by newly received ones.
- it may be the case that moving the NIMAGE processing to a pqact that is not
already overloaded would result in weather3's being able to process the
data without the very high load averages you experienced
re: I just logged into weather as 'mcidas' and:
- pointed at weather2 for RTNEXRAD data:
- removed the ADDE definitions for the RTNEXRAD dataset from the server
mapping table, $MCDATA/RESOLV.SRV
> OK, great. Thanks!
No worries.
re: weather3 is either a dual 3 GHz machine or a single with hyper-threading
> It's the latter, so is weather2. They're identical.
OK.
re: weather, on the other hand, has a single 3 GHz processor.
> Yep.
OK.
re: Given the hardware I see, I would think that weather would
struggle more than the other two machines
> Yes, and...
re: One of the biggest loads on any machine is X Windows -- it is a HUGE memory
user.
> Unfortunately, for WXP I have to use it.
Hmm... Can't you use a virtual frame buffer for generation of WXP products
for your web site?
re: processing of NEXRAD Level II data
> Correct, but the limited amount keeps the load from getting too high. I
> used to have all LEVEL2 data on weather3, and when I get the new machines,
> I will do so again.
OK.
re: I finished adjusting processing being done by McIDAS pqact.conf actions
to remove duplication of those being done for GEMPAK
> Good!
Yes, this will save disk AND CPU.
re: I propose that we investigate the high load averages seen when processing
NIMAGE data.
> I did a "yum -y install *iostat*" but didn't find any packages. Any clues?
Yup. I installed the package containing iostat as follows:
yum install sysstat-7.0.4-3.fc7
This installed /usr/bin/iostat and /usr/bin/sar. I then copied over the script
we use for system monitoring, ~ldm/util/uptime.tcl, and adjusted some entries
to work on your system (like the PATH defined in uptime.tcl). I then set up
cron to run the script once per minute:
#
# Monitor system performance
#
* * * * * util/uptime.tcl logs/weather3.uptime
0 0 1 * * bin/newlog logs/weather3.uptime 12
The items listed are:

20070827.2121 0.51 1.00 1.27 10 18 28 7481 39M 6M 38.00 18.50 0.50 43.00

  Value      Meaning
  --------   --------------------------------------
  20070827   date [ccyymmdd]
  2121       time [UTC]
  0.51       1 minute load average
  1.00       5 minute load average
  1.27       15 minute load average
  10         # downstream connections
  18         # upstream connections
  28         total # connections
  7481       age of oldest product in LDM queue [s]
  39M        free memory
  6M         swap in use
  38.00      %user
  18.50      %system
  0.50       I/O wait
  43.00      %idle
The output logged to this file will give us a time history of the performance
on weather3.
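For anyone who wants to work with that log, here is a minimal Python sketch
that splits one line into named fields. The field names are my own labels
for the items listed above, not anything defined by uptime.tcl, and the
memory values are left as strings because they carry a unit suffix.

```python
# Hypothetical labels for the columns that follow the date.time token
FIELDS = [
    "load_1min", "load_5min", "load_15min",
    "downstream", "upstream", "total_connections",
    "oldest_product_age_s", "free_memory", "swap_in_use",
    "pct_user", "pct_system", "pct_iowait", "pct_idle",
]


def parse_uptime_line(line):
    """Split one log line into a dict of strings keyed by field label."""
    tokens = line.split()
    date, time_utc = tokens[0].split(".")   # first token is date.time
    record = {"date": date, "time_utc": time_utc}
    record.update(zip(FIELDS, tokens[1:]))
    return record


rec = parse_uptime_line(
    "20070827.2121 0.51 1.00 1.27 10 18 28 7481 39M 6M 38.00 18.50 0.50 43.00"
)

# The four CPU percentages should account for all of the CPU time
cpu_total = sum(
    float(rec[k]) for k in ("pct_user", "pct_system", "pct_iowait", "pct_idle")
)
print(rec["date"], rec["time_utc"], rec["load_1min"], cpu_total)
```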
I have adjusted things on the McIDAS ADDE side to use GEMPAK-processed images
where needed. I believe that the redundancy in processing/disk use between
GEMPAK and McIDAS is now gone.
> Great.
re: I think that weather3 should easily be able to handle the processing
load you have on it AND file the NIMAGE products. The fact that it can't
leads me to suspect that something is wrong somewhere. The thing to do is
find out where the problem(s) is(are) and fix it(them).
> OK.
I will turn on NIMAGE processing in the combined McIDAS pqact.conf file,
~ldm/etc/pqact.conf_mcidas to see what happens on weather3. I will write
the NIMAGE data into the directory structure needed for GEMPAK, but the
action now in pqact.gempak will be commented out.
re: I can't see how what you have right now in weather3 is not able to keep up
with what you are trying to do.
> Hmm. OK.
re: take care with overclocking
> I do it now, no problems so far, but I only go 5% over.
Yes, but you ran into a heat problem on weather...
> Gotta run...
More as the NIMAGE testing proceeds.
Cheers,
Tom
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: KDN-271049
Department: Support McIDAS
Priority: Normal
Status: Closed