20030911: LDM question (cont.)
- Subject: 20030911: LDM question (cont.)
- Date: Thu, 11 Sep 2003 12:10:12 -0600
>From: "Mark J. Laufersweiler" <address@hidden>
>Organization: OU
>Keywords: 200309111605.h8BG5kLd002485 LDM rpc.ldmd split feed
Mark,
>No CONDUIT data yet. Wanting to do that is the big reason for the
>upgrades in hardware and part of the reason for determining loads.
OK, makes sense.
>We have two new dual proc AMD Opterons with a ide2scsi raid. One is
>for the standard feeds and the second is for radar. Since we need to
>decode all nids for all radars and want to also bring in the level2
>data, we figured a separate machine would be nice.
Another good idea.
>Right now:
>
> netstat | grep stokes.unidata-ldm | wc
> 28 168 2212
>
>with
>
>ps aux | grep ldm | wc
> 56 715 4571
>
>where 56 ranges between 49-60 usually, depending on the time, etc
>etc.
We instrument our machines with a Tcl script that gathers a number of
bits of information and saves them in a log file. Here is an annotated
snippet from the log file on thelma.ucar.edu:
1        2        3     4     5   6  7   8    9    10   11 12 13
20030911.1736 13.75  8.72  7.40 125 23 148 9610 4560M 461M  1  1
20030911.1737  7.55  8.00  7.23 125 23 148 9670 4560M 461M  1  1
20030911.1738  8.63  8.18  7.35 125 23 148 9732 4560M 461M  1  1
20030911.1739 13.47  9.28  7.77 125 23 148 9572 4559M 462M  1  1
20030911.1740 13.90 10.06  8.14 125 23 148 9633 4560M 461M  1  1
20030911.1741 20.14 12.18  9.01 125 23 148 9692 4560M 461M  0  1
Field Meaning
1 CCYYMMDD - date
2 HHMM - time (UTC)
3 ave1 - 1 minute load average
4 ave5 - 5 minute load average
5 ave15 - 15 minute load average
6 nfeed - number of downstream feed rpc.ldmds
7 nreceive - number of upstream request rpc.ldmds
8 nconnect - total number of rpc.ldmds
9 nsec - age of oldest product in queue
10 memfree - amount of free memory
11 swapused - swap space used
12 #wait - number of processes in WAIT state
13 #rtstats - number of connections to rtstats.unidata.ucar.edu
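If you want to post-process a log like this, here is a minimal Python
sketch that splits one line into the 13 fields listed above. It assumes
the layout shown in the snippet (fields 1 and 2 joined by a period, the
rest separated by whitespace). The names UptimeRecord and parse_line are
made up for illustration and are not part of our script.

```python
from typing import NamedTuple


class UptimeRecord(NamedTuple):
    """One log line, using the field names from the table above."""
    date: str       # field 1, CCYYMMDD
    time: str       # field 2, HHMM (UTC)
    ave1: float     # field 3
    ave5: float     # field 4
    ave15: float    # field 5
    nfeed: int      # field 6
    nreceive: int   # field 7
    nconnect: int   # field 8
    nsec: int       # field 9
    memfree: str    # field 10, kept as text (e.g. "4560M")
    swapused: str   # field 11, kept as text (e.g. "461M")
    nwait: int      # field 12
    nrtstats: int   # field 13


def parse_line(line: str) -> UptimeRecord:
    """Parse one whitespace-separated log line into an UptimeRecord."""
    parts = line.split()
    # Date and time share the first token, so 13 fields arrive as 12 tokens.
    if len(parts) != 12:
        raise ValueError("expected 12 tokens, got %d" % len(parts))
    date, _, hhmm = parts[0].partition(".")
    return UptimeRecord(
        date, hhmm,
        float(parts[1]), float(parts[2]), float(parts[3]),
        int(parts[4]), int(parts[5]), int(parts[6]), int(parts[7]),
        parts[8], parts[9],
        int(parts[10]), int(parts[11]),
    )
```

For example, parsing the last line of the snippet gives a record whose
nconnect (148) is the sum of nfeed (125) and nreceive (23).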
The script is run out of cron every minute, so it provides a nice
history of the performance of the machine. Would you like to run the
same script on your machine(s)? The script has to be tweaked for
the OS it runs on, but we have a version we are running on our FreeBSD
LDM box. If you are interested, I put the script in the pub/ldm/scripts/freebsd
directory of anonymous FTP on our FTP server, ftp.unidata.ucar.edu.
You will have to find out where on your system tclsh exists (if it does)
and alter the first line of the script. The crontab entry we have
for running the script on our FreeBSD machine is:
#
# Log the system usage minute by minute
* * * * * util/uptime.tcl logs/newshemp.uptime
I would change this to:
#
# Log the system usage minute by minute
* * * * * util/uptime.tcl logs/stokes.uptime
for stokes.
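If tclsh turns out not to be available, the general idea can be sketched
in a few lines of Python. To be clear, this is not the uptime.tcl script:
it records only the date, time, and load averages (fields 1 through 5),
because the rpc.ldmd counts, queue age, and memory figures need
OS-specific commands. The function names are hypothetical, and
os.getloadavg() is available only on Unix-like systems.

```python
import os
import sys
import time


def format_record(now: float, loads: tuple) -> str:
    """Build 'CCYYMMDD.HHMM ave1 ave5 ave15' from a UTC epoch time."""
    stamp = time.strftime("%Y%m%d.%H%M", time.gmtime(now))
    return "{} {:.2f} {:.2f} {:.2f}".format(stamp, *loads)


def append_record(path: str) -> None:
    """Append one record for the current minute to the log file."""
    line = format_record(time.time(), os.getloadavg())
    with open(path, "a") as log:
        log.write(line + "\n")


if __name__ == "__main__" and len(sys.argv) == 2:
    # Intended to be run from cron with the log file path as the argument.
    append_record(sys.argv[1])
```

Saved under a name of your choosing and started from the same kind of
crontab entry as above, with the log file path as its only argument, it
would add one line per minute.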
>But with the desire to decode CONDUIT and the load that will come
>from that, we need to decide which machine will handle the
>processing.
Right.
>As a side, we will build one machine with Redhat9 and the second
>with FreeBSD. SMP does not always work with FreeBSD, but FreeBSD
>seems committed to AMD chipsets. We will see.
We are running FreeBSD 4.8 on a dual Athlon 2400+ machine with an SMP
kernel, and it seems to be performing well while ingesting all
CRAFT and CONDUIT data (no relay operation, however).
Tom