
Re: 20011008: IDD latencies at PSU (cont.)



"Arthur A. Person" wrote:
> 
> Anne,
> 
> On Mon, 8 Oct 2001, Anne Wilson wrote:
> 
> > Unidata Support wrote:
> > >
> > > >
> > > > Our latencies have quickly popped back up to about 55 minutes now from
> > > > motherlode.  SSEC NEXRAD is still running within seconds.  I'm working
> > > > with our networking people now to see if they can discover anything.
> > > >
> > > >                                      Art.
> >
> > Hi Art,
> >
> > How are things looking now?
> 
> I had to step out of the office yesterday afternoon.  Before that, I
> decided to upgrade stuff on our ldm machine to eliminate any other
> possible problems.  I've upgraded our machine to Linux kernel 2.4.3-12
> (RedHat 7.1), upgraded the ldm to 5.1.4, and rebuilt the queue.  Looking
> through our logs, data is still slow from motherlode (30-60 minutes)
> while, for comparison, NEXRAD data from sunshine.ssec.wisc.edu is within a
> minute or less.  Where do we go from here?
> 
>                                   Thanks.
> 
>                                     Art.
> 
> >
> Arthur A. Person
> Research Assistant, System Administrator
> Penn State Department of Meteorology
> email:  address@hidden, phone:  814-863-1563

Good Morning, Art,

I don't know what to say at this point.  motherlode is running on time. 
This morning, traceroutes to your site look fine at least until they
reach your campus, after which I can't tell.  This is similar to what I
was seeing yesterday:

(anne) imogene:/home/anne/ldm/doc 63 % traceroute ldm.meteo.psu.edu
traceroute to ldm.meteo.psu.edu (128.118.28.12), 30 hops max, 38 byte packets
 1  flra-n140 (128.117.140.252)  2.345 ms  0.254 ms  0.222 ms
 2  vbnsr-n2.ucar.edu (128.117.2.252)  1.020 ms  0.708 ms  0.667 ms
 3  internetr-n243-104.ucar.edu (128.117.243.106)  0.837 ms  0.815 ms  0.996 ms
 4  denv-abilene.ucar.edu (128.117.243.126)  1.975 ms  1.892 ms  2.549 ms
 5  kscy-dnvr.abilene.ucaid.edu (198.32.8.14)  13.050 ms  12.476 ms  12.641 ms
 6  ipls-kscy.abilene.ucaid.edu (198.32.8.6)  21.746 ms  22.121 ms  21.942 ms
 7  clev-ipls.abilene.ucaid.edu (198.32.8.26)  28.188 ms  30.909 ms  28.183 ms
 8  abilene.psc.net (192.88.115.122)  31.353 ms  34.767 ms  36.039 ms
 9  penn-state.psc.net (198.32.224.66)  36.348 ms  35.070 ms  35.083 ms
10  * * *
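
To compare traceroutes over time, something like the following could pull out the last hop that answered. This is only a rough Python sketch, not an existing tool: the function name last_responding_hop is made up for illustration, and it assumes output in the format shown above. Lines that do not start with a hop number (the header, wrapped continuation lines) are skipped.

```python
import re

# A hop line starts with the hop number, e.g.
#  9  penn-state.psc.net (198.32.224.66)  36.348 ms  35.070 ms  35.083 ms
HOP_RE = re.compile(r"^\s*(?P<hop>\d+)\s+(?P<rest>.+)$")


def last_responding_hop(lines):
    """Return (hop_number, host) for the last hop that replied, or None."""
    last = None
    for line in lines:
        m = HOP_RE.match(line)
        if not m:
            continue  # header or wrapped continuation line
        fields = m.group("rest").split()
        if all(f == "*" for f in fields):
            continue  # every probe for this hop timed out
        last = (int(m.group("hop")), fields[0])
    return last
```

For the trace above this should report hop 9, penn-state.psc.net, i.e. the last hop before the campus network.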

Traceroutes to the other sites that were having trouble last week look
great.  At this point I'm suspecting something on your campus network,
although I know that you spoke to those people yesterday.

From the logs, it does look like motherlode had more trouble than usual
maintaining a connection to your site yesterday, especially at 16Z -
17Z.  The log entries show connections being reestablished.  Before
that, a few 'time elapsed' messages show products having trouble getting
through, but not an inordinate amount:

motherlode.ucar.edu% grep psu ldmd.log*
ldmd.log.1:Oct 08 01:05:54 motherlode.ucar.edu ldm[17997]: Connection from ldm.meteo.psu.edu
ldmd.log.1:Oct 08 15:18:30 motherlode.ucar.edu ldm(feed)[17997]: h_clnt_call: ldm.meteo.psu.edu: 1: time elapsed  21.810677
ldmd.log.1:Oct 08 16:27:25 motherlode.ucar.edu ldm[1513]: Connection from ldm.meteo.psu.edu
ldmd.log.1:Oct 08 16:40:00 motherlode.ucar.edu ldm[3060]: Connection from ldm.meteo.psu.edu
ldmd.log.1:Oct 08 16:49:02 motherlode.ucar.edu ldm[3902]: Connection from ldm.meteo.psu.edu
ldmd.log.1:Oct 08 16:54:12 motherlode.ucar.edu ldm[4354]: Connection from ldm.meteo.psu.edu
ldmd.log.1:Oct 08 17:06:29 motherlode.ucar.edu ldm[5750]: Connection from ldm.meteo.psu.edu
ldmd.log.1:Oct 08 17:11:21 motherlode.ucar.edu ldm[6220]: Connection from ldm.meteo.psu.edu
ldmd.log.1:Oct 08 17:21:44 motherlode.ucar.edu ldm[7441]: Connection from ldm.meteo.psu.edu
ldmd.log.1:Oct 08 18:17:56 motherlode.ucar.edu ldm[29708]: Connection from ldm.meteo.psu.edu
ldmd.log.2:Oct 07 20:50:36 motherlode.ucar.edu ldm(feed)[1224]: h_clnt_call: ldm.meteo.psu.edu: BLKDATA: time elapsed  29.586863
ldmd.log.3:Oct 06 11:37:54 motherlode.ucar.edu ldm(feed)[1224]: h_clnt_call: ldm.meteo.psu.edu: BLKDATA: time elapsed  27.726462
ldmd.log.3:Oct 06 12:20:11 motherlode.ucar.edu ldm(feed)[1224]: h_clnt_call: ldm.meteo.psu.edu: COMINGSOON: time elapsed  24.870132
ldmd.log.3:Oct 06 15:38:01 motherlode.ucar.edu ldm(feed)[1224]: h_clnt_call: ldm.meteo.psu.edu: BLKDATA: time elapsed  31.245140
ldmd.log.3:Oct 06 17:38:24 motherlode.ucar.edu ldm(feed)[1224]: h_clnt_call: ldm.meteo.psu.edu: 1: time elapsed  29.320011
ldmd.log.4:Oct 05 14:58:22 motherlode.ucar.edu ldm[14054]: Connection from ldm.meteo.psu.edu
ldmd.log.4:Oct 05 15:17:29 motherlode.ucar.edu ldm(feed)[14054]: h_clnt_call: ldm.meteo.psu.edu: BLKDATA: time elapsed  21.641506
ldmd.log.4:Oct 05 17:25:05 motherlode.ucar.edu ldm[1224]: Connection from ldm.meteo.psu.edu
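
To make the clustering of reconnections easier to see, the 'Connection from' entries could be tallied by hour. The following is a minimal Python sketch, not part of the LDM: the function name connections_per_hour is hypothetical, and it assumes each log entry sits on a single line (not wrapped by the mail client) in the format shown in the grep output above.

```python
import re
from collections import Counter

# Matches one unwrapped entry, with or without grep's "filename:" prefix, e.g.
# ldmd.log.1:Oct 08 16:27:25 motherlode.ucar.edu ldm[1513]: Connection from ldm.meteo.psu.edu
LINE_RE = re.compile(
    r"^(?:\S+:)?(?P<mon>[A-Z][a-z]{2}) (?P<day>\d{2}) "
    r"(?P<hour>\d{2}):\d{2}:\d{2} \S+ ldm\[\d+\]: "
    r"Connection from (?P<host>\S+)$"
)


def connections_per_hour(lines, host):
    """Count 'Connection from <host>' entries per (month, day, hour)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("host") == host:
            counts[(m.group("mon"), m.group("day"), m.group("hour"))] += 1
    return counts
```

Run over the entries above, this should show the reconnections concentrated in the 16Z and 17Z hours on Oct 08. The 'time elapsed' entries are deliberately ignored, since they come from ldm(feed) and do not match the pattern.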

However, there are no log entries in today's log, which started at 23Z
on the 8th, 16 hours ago now.  If there were connectivity problems
yesterday that are now gone, I'd think that it might have taken you a
while to catch up, but not 16 hours.

I'll have to think on this a bit and will need some time.  I started
'netcheck' which will sample the connection to your machine over time. 
Are your downstream sites complaining? 

Btw, is the clock accurate on your machine?

Anne
-- 
***************************************************
Anne Wilson                     UCAR Unidata Program            
address@hidden                 P.O. Box 3000
                                  Boulder, CO  80307
----------------------------------------------------
Unidata WWW server       http://www.unidata.ucar.edu/
****************************************************