Re: LDM at Riverside Fire Weather Office
- Subject: Re: LDM at Riverside Fire Weather Office
- Date: Tue, 23 Feb 1999 13:56:43 -0700 (MST)
Scott,
First, I did not like the tone of your e-mail message; you are unclear and
not polite. In fact, I had permission not to correspond with you because
NWS is actually outside our support arena. But out of curiosity I took a
look and got hpfire1 to receive data. The problem was a period after the
hostname; the output is included in this message. Also, I recommend that
you update your LDM to version 5.0.6 and install it according to the
instructions on the Web page. Make sure you get the latest pqact.c code for
the AFOS patch, and look at the HP 10.2 binary directions for extra build
instructions.
Robb...
hpfire1: 27 % ldmping -i 5 -h hackey.sac.noaa.gov.
Feb 23 20:28:43  State        Elapsed  Port  Remote_Host           rpc_stat
Feb 23 20:28:43  NAMED       0.003104     0  hackey.sac.noaa.gov.  gethostbyname(hackey.sac.noaa.gov.): lookup failed
Feb 23 20:28:48  NAMED       0.000967     0  hackey.sac.noaa.gov.  gethostbyname(hackey.sac.noaa.gov.): lookup failed
Feb 23 20:28:53  NAMED       0.000973     0  hackey.sac.noaa.gov.  gethostbyname(hackey.sac.noaa.gov.): lookup failed
Feb 23 20:28:59  NAMED       0.000798     0  hackey.sac.noaa.gov.  gethostbyname(hackey.sac.noaa.gov.): lookup failed
Feb 23 20:29:04  NAMED       0.000788     0  hackey.sac.noaa.gov.  gethostbyname(hackey.sac.noaa.gov.): lookup failed
Feb 23 20:29:09  NAMED       0.000786     0  hackey.sac.noaa.gov.  gethostbyname(hackey.sac.noaa.gov.): lookup failed
hpfire1: 28 % ldmping -i 5 -h hackey.sac.noaa.gov
Feb 23 20:29:49  State        Elapsed  Port  Remote_Host           rpc_stat
Feb 23 20:29:49  RESPONDING  0.072713   388  hackey.sac.noaa.gov
Feb 23 20:29:54  RESPONDING  0.031828   388  hackey.sac.noaa.gov
Feb 23 20:29:59  RESPONDING  0.031827   388  hackey.sac.noaa.gov
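
For anyone scripting around this, here is a minimal Python sketch of the fix shown in the two ldmping runs above: strip the trailing period from the hostname before using it, and read the State column to tell a failed lookup (NAMED) from a working connection (RESPONDING). This is not part of LDM. The function names are hypothetical, and the field layout is assumed from the output pasted above rather than taken from the ldmping documentation.

```python
def strip_trailing_dot(hostname: str) -> str:
    """Return the hostname without a single trailing period, if present."""
    hostname = hostname.strip()
    return hostname[:-1] if hostname.endswith(".") else hostname


def ldmping_state(line: str):
    """Return the State field of an ldmping-style data line, or None.

    Assumed layout (taken from the output above, not from LDM docs):
    'Mon DD HH:MM:SS STATE ELAPSED PORT REMOTE_HOST [rpc_stat...]'
    """
    fields = line.split()
    # Skip short lines and the column-header line.
    if len(fields) < 7 or fields[3] == "State":
        return None
    return fields[3]


if __name__ == "__main__":
    print(strip_trailing_dot("hackey.sac.noaa.gov."))
    print(ldmping_state(
        "Feb 23 20:29:49 RESPONDING 0.072713 388 hackey.sac.noaa.gov"))
```

A wrapper script could call strip_trailing_dot on each upstream hostname before invoking ldmping, then treat any state other than RESPONDING as a reason to check the name and the connection.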
hpfire1: 35 % !!
ldmadmin watch
(Type ^D or ^C when finished)
Feb 23 20:39:52 pqutil: 108 19990223204025.847 AFOS 28833 SFOMTRUKI (SXUS90 KWBC 232030)
Feb 23 20:39:52 pqutil: 133 19990223204026.445 AFOS 28834 BOIMTRMUO (SXUS90 KWBC 232035)
Feb 23 20:39:53 pqutil: 132 19990223204026.661 AFOS 28835 FAIMTRBTI (SXUS90 KWBC 232035)
Feb 23 20:39:56 pqutil: 147 19990223204029.883 AFOS 28836 WVRBOYPN1 (SNVD17 CWVR 231900)
Feb 23 20:39:56 pqutil: 147 19990223204030.139 AFOS 28837 WVRBOYPN1 (SNVD17 CWVR 231900)
Feb 23 20:39:57 pqutil: 105 19990223204030.740 AFOS 28838 IEETAF11 (FCEE32 LOWM 231900 RRC)
Feb 23 20:39:57 pqutil: 115 19990223204030.949 AFOS 28839 NMCPRCCO (UBCO90 KWBC 232039)
On Mon, 22 Feb 1999, Scott Cunningham wrote:
> >
> > Scott,
> >
> > According to top, the machine appears to be CPU bound, maybe even by a
> > runaway netscape process (PID 1957). Also, I notice that there are many
> > pqact entries with FILE -overwrite -strip parts. These cause high
> > disk usage. It looks like some gempak decoders were running too.
>
> This machine (hackey.sac.noaa.gov) has been running in this configuration for
> years with no problems and it still has no problems with ldm or the decoders
> or the disk I/O or any pqact entries.
>
> ***THE PROBLEM IS WITH THE RPC.LDMD CONNECTION TO 166.5.202.192!!!! *********
>
>
> > Also you are feeding downstream sites.
>
> YES!!! One IS being fed successfully (166.2.43.144) and has been for over
> three months now. The one that isn't is why I have written to you about the
> problem!!!!!!!!
> Certainly there can't be a limit of one downstream site - Salt Lake City is
> feeding 24 sites!!!
>
> > I would comment out the
> > pqact entry in the ldmd.conf and try running the system.
>
> No, we can't do that, we use this data 24 hours a day to support our
> operations.
>
> > Also, it was
> > impossible to search the syslog.log over the network. There are
> > instructions on creating ldmd.log files at:
> >
> > http://www.unidata.ucar.edu/packages/ldm/ldmPreInstallList.html#s8
> >
> > It's much easier to debug a system that has been set up according to the
> > ldmd conventions.
> As mentioned earlier, this configuration is being run successfully in at least
> 20 other Western Region NWS offices besides ours.
> >
> > When you remedy these problems, then let's see if the ldm exits.
>
> System: hackey Mon Feb 22 12:05:55 1999
> Load averages: 1.13, 0.85, 1.28
> 163 processes: 160 sleeping, 3 running
> Cpu states:
> LOAD USER NICE SYS IDLE BLOCK SWAIT INTR SSYS
> 1.13 20.7% 0.0% 18.7% 60.6% 0.0% 0.0% 0.0% 0.0%
>
> Memory: 68636K (39608K) real, 84752K (49028K) virtual, 89124K free Page# 1/12
>
> TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
> ? 25392 scunn 192 20 1424K 1460K run 0:04 10.88 10.86 dtwm
> ? 25419 scunn 185 20 504K 896K run 0:06 5.72 5.71 dtpad
> ? 11346 daemon 154 20 18580K 16808K sleep 12:34 4.20 4.20 X
>
> The runaway netscapes have been killed and the CPU is now happy and not
> terribly busy at all. The Riverside machine is still not connected.
>
> I have spent MANY hours troubleshooting this problem and compiling the
> information/output that was sent in the first e-mail. Despite this, I feel
> you either do not understand the problem or feel that I have haphazardly
> installed ldm. I have about 4 years' experience working with this software
> and have never required support from UNIDATA before. If the nature of the
> problem is still not clear to you, please reread my correspondence or call
> me to clear it up.
> Thanks...
>
> Scott Cunningham
> >
> > Robb...
> >
> > System: hackey Fri Feb 19 15:29:57 1999
> > Load averages: 1.60, 1.57, 1.70
> > 153 processes: 151 sleeping, 2 running
> > Cpu states:
> > LOAD USER NICE SYS IDLE BLOCK SWAIT INTR SSYS
> > 1.60 54.9% 0.0% 45.1% 0.0% 0.0% 0.0% 0.0% 0.0%
> >
> > Memory: 91824K (65840K) real, 115800K (73860K) virtual, 70464K free Page# 1/11
> >
> > TTY   PID USERNAME PRI NI   SIZE    RES STATE    TIME %WCPU  %CPU COMMAND
> >   ?  1957 rhart    236 20 24032K 12704K run   6079:30 90.20 90.04 netscape
> >   ? 15331 gempak   154 20   260K   240K sleep    0:20  1.87  1.87 pqing
> >   ?  1764 rhart    154 20 25368K 13444K sleep    3:48  0.48  0.48 netscape
> >   ? 16198 gempak   178 20   284K   292K run      0:00  0.37  0.37 top
> >   ? 15055 scunn    154 20   432K   884K sleep    0:06  0.34  0.34 dtterm
> >   ?   362 root     154 20    72K   164K sleep   31:45  0.21  0.21 syncer
> >   ?  1487 gempak   168 22    64K   244K sleep   20:37  0.18  0.18 ftp_loop
> >   ?     3 root     128 20     0K     0K sleep   14:10  0.13  0.13 statdaemon
> >   ? 15330 gempak   168 20   328K   308K sleep    0:01  0.11  0.11 pqact
> >   ?   820 root     154 20  6076K  1364K sleep    3:56  0.07  0.07 rpcd
> >   ? 10104 gempak   168 20  2280K  1568K sleep    2:24  0.06  0.06 nag_watch.pl
> >   ? 15054 scunn    154 20  1596K  1460K sleep    0:01  0.05  0.05 dtfile
> >   ?   574 daemon   154 20 26996K 25300K sleep   19:50  0.04  0.04 X
> >
> >
> > ===============================================================================
> > Robb Kambic Unidata Program Center
> > Software Engineer III Univ. Corp for Atmospheric Research
> > address@hidden WWW: http://www.unidata.ucar.edu/
> > ===============================================================================
> >
> >
>
>
> --
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> | Scott Cunningham National Weather Service |
> | (916) 979-3041 x224 3310 El Camino Ave. #228 |
> | (916) 979-3052 FAX Sacramento, CA 95821-6340 |
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
===============================================================================
Robb Kambic Unidata Program Center
Software Engineer III Univ. Corp for Atmospheric Research
address@hidden WWW: http://www.unidata.ucar.edu/
===============================================================================